FreedomBox MediaWiki Fail2ban filter
Created by Steven Baltakatei Sandoval on 2023-12-28T15:37-08 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-12-31T21:36-08.
Summary
I noticed a significant increase in CPU usage on a public FreedomBox
webserver I run for publishing my notes via a MediaWiki instance at
reboil.com/mediawiki. The high usage was caused by frequent expensive
requests for dynamically generated special pages. I implemented two
solutions: modifying the server's robots.txt and creating a fail2ban
filter.
Background
Note that this procedure will likely only work for FreedomBox instances
since FreedomBox itself makes automatic configuration file
changes. For example, fail2ban should already be installed and
maintained by FreedomBox.
FreedomBox is a Debian package that converts the machine it is installed on into a personal cloud server. Specifically, it converts the machine into an Apache server with a web interface for installing apps such as WordPress (blog), MediaWiki (wiki), Bepasty (file sharing), Ejabberd (XMPP chat server), Postfix/Dovecot (email server), OpenVPN, and Radicale (calendar and address book), among others. See the Manual for details.
All files are created or edited with root access. To obtain it, log in
via ssh with a FreedomBox account that has administrator privileges
and run:
$ sudo su -
#
This article assumes a basic knowledge of GNU/Linux such as logging
into your FreedomBox via ssh, editing text files via the command
line, viewing file contents with cat, running Bash scripts, and
being aware of file ownership issues.
Analysis
I detected the high traffic via a journalctl command resembling the following:
journalctl
# journalctl --output=short-iso --follow
A more focused command is described in the following Bash script, run
as the root user.
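#!/bin/bash
# Follow the journal, keep only Apache access-log entries, and page them
# in less without line wrapping (-S), starting in follow mode (+F).
journalctl --output=short-iso --follow | \
grep --line-buffered "apache-access" | \
less -S +F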
Below are portions of example lines of expensive index.php? requests.
/mediawiki/index.php?returnto=1770-03-07&returntoquery=redirect%3Dno&title=Speci
/mediawiki/index.php?target=1770-03-22&title=Special%3AWhatLinksHere HTTP/1.1" 4
/mediawiki/index.php?action=history&title=1784-03-15 HTTP/1.1" 200 5025 "-" "Moz
/mediawiki/index.php?returnto=1770-03-07&returntoquery=redirect%3Dno&title=Speci
/mediawiki/index.php?action=history&title=1784-03-15 HTTP/1.1" 200 6494 "-" "Moz
/mediawiki/index.php?action=edit&title=1770-03-08 HTTP/1.1" 200 5031 "-" "Mozill
/mediawiki/index.php?target=1862-04-02&title=Special%3AWhatLinksHere HTTP/1.1" 4
/mediawiki/index.php?action=edit&title=1770-03-08 HTTP/1.1" 200 4911 "-" "Mozill
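To see which clients send the most requests of this kind, the matching lines can be tallied per address. The following is a minimal sketch under stated assumptions: the function name count_index_requests is hypothetical, and each log line is assumed to contain the client's IPv4 address followed by " -" before the quoted request, as in Apache's common log formats. IPv6 clients are ignored.

```shell
#!/bin/bash
# Count GET requests for /mediawiki/index.php? per client IPv4 address.
# Reads log lines on stdin; prints "COUNT IP", highest count first.
# ASSUMPTION: the client address appears as "A.B.C.D -" before the request.
# Possible use on the server (run as root):
#   journalctl --output=short-iso --since today | count_index_requests | head
count_index_requests() {
  sed -nE 's/^(.*[^0-9.])?([0-9]{1,3}(\.[0-9]{1,3}){3}) -.*"GET \/mediawiki\/index\.php\?.*$/\2/p' \
    | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

Addresses near the top of the list are the ones a rate-based ban would affect first.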
top
High CPU usage was indicated by the appearance of multiple php-fpm7.4
processes in the output of the top command, a task manager
available on most Unix-like operating systems such as Debian 12.
dstat
dstat was a system performance monitoring utility that I was fond
of. Although FreedomBox uses a Red Hat version, which took over the
dstat name and replaced the original with a rewrite that lacks
the handy --top-cpu option, the following command should still work
to show you relevant CPU information, outputting averaged data in a
line every 60 seconds:
# dstat --time --load --proc --cpu --mem --disk --io --net --sys --vm 60
dool
dool is the python3 compatible fork of dstat that isn't tracked by
Debian but which recreates the dstat behavior I'm used to, such as
including the --top-cpu option. You can install it into local user
space via:
# git clone https://github.com/scottchiefbaker/dool.git dool
# cd dool
# ./install.py
You are root, doing a local install
Installing binaries to /usr/bin/
Installing plugins to /usr/share/dool/
Installing manpages to /usr/share/man/man1/
Install complete. Dool installed to /usr/bin/dool
You can then run the command via:
# dool --time --load --proc --cpu --top-cpu --mem --disk --io --net --sys --vm 60
Installing for use by the
Methodology
robots.txt
According to the MediaWiki Manual for robots.txt, requests from
webcrawlers to index.php may be disallowed by adding the following
text to a server's robots.txt file. In my particular FreedomBox
instance, the file is located at /var/www/html/robots.txt. I use the
cat command merely to show the contents and location of the file for
this explanation.
# cat /var/www/html/robots.txt
User-agent: *
Disallow: /mediawiki/index.php?
No restart of the apache2 service should be necessary.
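If the file is edited by a script rather than by hand, it helps to make the change idempotent so that repeated runs do not stack duplicate rules. The following is a minimal sketch; the function name add_index_disallow is hypothetical, and it simply appends a new "User-agent: *" group rather than merging into an existing one, which assumes crawlers combine groups for the same user agent. It also assumes an existing file ends with a newline.

```shell
#!/bin/bash
# Append the MediaWiki index.php rule to a robots.txt file unless the
# exact Disallow line is already present.
# Usage (as root): add_index_disallow /var/www/html/robots.txt
add_index_disallow() {
  local file="$1"
  local rule='Disallow: /mediawiki/index.php?'
  # -F fixed string, -x whole-line match, -q quiet
  if [ -f "$file" ] && grep -qFx -- "$rule" "$file"; then
    return 0
  fi
  printf 'User-agent: *\n%s\n' "$rule" >> "$file"
}
```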
index.php is the main access point for a MediaWiki site. In my
FreedomBox installation, MediaWiki pages are served by default at URLs
omitting index.php. For example, my article on the Moon is served by
default at https://reboil.com/mediawiki/Moon. Notably,
https://reboil.com/mediawiki/index.php?title=Moon also works, but it
will be matched by the fail2ban filter described below.
This modification of robots.txt alone is an indirect way to reduce
web crawler requests for dynamically generated MediaWiki pages, but it
relies on coöperation from webcrawlers themselves to honor the
Disallow request. The following fail2ban filter is an active
response that bans IP addresses that make repeated requests.
fail2ban filter
Fail2ban is a program used by default with a FreedomBox instance. A generic tutorial for configuring it is available here.
For my instance, I set up the filter by creating two files and running
a systemd command to restart the fail2ban service.
The first file to create sets up a filter for fail2ban that uses a
regular expression to identify requests for index.php. It is a
3-line file named mediawiki.conf saved in /etc/fail2ban/filter.d/.
# cat /etc/fail2ban/filter.d/mediawiki.conf
[Definition]
failregex = <HOST> -.*"GET /mediawiki/index\.php\?.*"
ignoreregex =
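fail2ban ships a fail2ban-regex tool for checking a filter against real log data. As a rough offline approximation, the same pattern can be tried with grep. This is only a sketch: the function name matches_mediawiki_filter is hypothetical, and the <HOST> tag is replaced by a plain IPv4 pattern, whereas fail2ban's own expansion of <HOST> is broader.

```shell
#!/bin/bash
# Approximate the failregex from mediawiki.conf using grep -E.
# ASSUMPTION: <HOST> is stood in for by an IPv4-only pattern.
# Reads log lines on stdin; exit status 0 if at least one line matches.
matches_mediawiki_filter() {
  grep -qE '[0-9]{1,3}(\.[0-9]{1,3}){3} -.*"GET /mediawiki/index\.php\?.*"'
}
```

A request for /mediawiki/Moon should not match, while any GET request whose path begins with /mediawiki/index.php? should.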
The second file is a jail.local which defines a [mediawiki] jail,
references the mediawiki.conf file (via the filter = mediawiki
line), and specifies the trigger and reset conditions for an IP
address ban.
# cat /etc/fail2ban/jail.local
[mediawiki]
enabled = true
port = http,https
filter = mediawiki
logpath = %(apache_error_log)s
maxretry = 60
findtime = 600
bantime = 3600
The logpath line is specific to how FreedomBox configures its logs
when applying fail2ban to other applications. maxretry is the
number of matching requests within a window of findtime seconds that
will trigger a ban lasting bantime seconds.
In this particular example, a webcrawler requesting 60 or more pages
via my MediaWiki's index.php access point within a 10-minute window
(600 seconds) will get a 1-hour ban on its IP address. Most
dynamically generated pages a typical human user would request involve
viewing a page's history or requesting to edit a page. I find it
implausible for a human to make sixty such requests in ten minutes
(that's one request every ten seconds) and so I find these limits
reasonable.
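The arithmetic behind these thresholds can be checked with shell integer math. This is a small sketch using the values from the jail.local listing (maxretry = 60, findtime = 600, bantime = 3600); the function name describe_jail is hypothetical, and the divisions are integer divisions.

```shell
#!/bin/bash
# Summarize jail thresholds.
# Arguments: maxretry, findtime (seconds), bantime (seconds).
describe_jail() {
  local maxretry="$1" findtime="$2" bantime="$3"
  echo "window: $(( findtime / 60 )) min"
  echo "ban: $(( bantime / 60 )) min"
  echo "sustained rate to trigger: 1 request per $(( findtime / maxretry )) s"
}

describe_jail 60 600 3600
```

For these values the window works out to 10 minutes, the ban to 60 minutes, and the sustained rate to one request every 10 seconds.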
In my instance, I had to create the jail.local file. According to
the Linode tutorial, jail.local is meant to permit a local
administrator to extend and override configurations established in
default .conf files such as jail.conf.
To immediately apply the new fail2ban filter, the service must be
restarted:
# systemctl restart fail2ban.service
Current bans can be viewed via a fail2ban-client command:
# fail2ban-client status mediawiki
Status for the jail: mediawiki
|- Filter
| |- Currently failed: 1
| |- Total failed: 5920
| `- Journal matches:
`- Actions
|- Currently banned: 1
|- Total banned: 51
`- Banned IP list: 47.76.35.19
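For scripting, the banned addresses can be pulled out of that status text. The following is a minimal sketch that assumes the "Banned IP list:" line has the layout shown above, with addresses separated by whitespace; the function name banned_ips is hypothetical.

```shell
#!/bin/bash
# Print one banned address per line from `fail2ban-client status JAIL`
# output supplied on stdin.
# Possible use (as root): fail2ban-client status mediawiki | banned_ips
# A mistaken ban can then be lifted with, for example:
#   fail2ban-client set mediawiki unbanip 47.76.35.19
banned_ips() {
  sed -nE 's/^.*Banned IP list:[[:space:]]*(.*)$/\1/p' \
    | tr -s ' \t' '\n' \
    | sed '/^$/d'
}
```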
Relevant system information
- FreedomBox version: 23.6.2
- Operating system: Debian GNU/Linux 11 (bullseye)
San Juan College 45th Annual Luminaria photos
Created by Steven Baltakatei Sandoval on 2023-12-04T04:51+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-12-04T05:37+00.
Summary
On 2023-12-02, I attended the 45th Annual Luminarias display held by San Juan College (SJC). I took some photos which I have uploaded to a static image gallery using fgallery.
Background
Growing up in Farmington, NM, I had considered the holiday luminaria display at SJC to be a routine celebration observed everywhere. It wasnʼt until after I left the area that I realized lighting thousands of candles inside paper bags was a somewhat niche tradition. The Wikimedia Commons category Luminaries in the United States shows some photographs but none of the San Juan College event which has been going on for over 4 decades. At some point I'll upload some photos to Commons but I wanted to get some select photos uploaded to my static resource site https://reboil.com/res/.
2023-10-14 Annular Solar Eclipse
Created by Steven Baltakatei Sandoval on 2023-10-15T09:31+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-10-15T10:40+00.
(Image © Steven Baltakatei Sandoval, 🅭🅯🄎4.0)
Summary
I took a photograph of the 2023-10-14 solar eclipse from Vancouver, Washington. I uploaded a copy to Wikimedia Commons here.
Background
I got lucky and managed to get just the right amount of cloudcover to take a photo without any special equipment besides my Sony a7 III with a 70-300mm telephoto lens.
I had known about the eclipse for a few days, having not really tracked it in my notes since it was not a total solar eclipse. However, some family members let me know it was happening and I noticed I wouldn't have to travel very far from my Vancouver, Washington, dwelling to see a significant occultation. I saw that it was scheduled to be viewable at my local morning around 2023-10-14T09:00-07. I prepared my eclipse glasses the night before.
So, when I woke up on the day of the eclipse, despite being somewhat disappointed at seeing an overcast sky, I went outside a few minutes before maximum occlusion. I found my roommates were already outside looking to the southeast at a bright spot in the clouds where the sun was. The morning clouds were clearing. I saw that some variation in cloudcover would allow me to see the eclipse without any glasses. I also saw that it was possible for me to use the cloud cover to take a photo with my camera. I quickly retrieved my camera and snapped a few photos when a dark cloud passed by. The result allowed me to clearly see the crescent shape of the partial solar eclipse.
I then downloaded my photos to a computer, cropped one of the better images, then uploaded it to my Mastodon account, Wikimedia Commons, and my website. Some commenters said the image looked like it could be an album cover; I told them anyone was free to use it under the CC BY-SA 4.0 license.
To be honest, there are many higher quality photographs than mine. In particular, I like this set by a Ross A. Whitley, taken from the San Jose Mission in San Antonio, Texas.
(Image © Ross A. Whitley, 🅭🅮1.0)
References
- Ross A. Whitley. (2023-10-14). “Annular Eclipse 2023”. Flickr. Accessed 2023-10-14.
- Steven Baltakatei Sandoval. (2023-10-14). “2023-10-14 solar eclipse from Vancouver, WA”. Wikimedia Commons. Accessed 2023-10-14.
fgallery example
Created by Steven Baltakatei Sandoval on 2023-07-07T01:40+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-07-07T03:41+00.
Summary
I wanted to test out a static HTML+Javascript photo gallery package I found on Debian to answer a question on a Lemmy thread about self-hosted photo galleries. The end result was this example gallery.
Setup procedure
On a Debian 11 system, the following commands may be used to create an
fgallery image gallery.
Install fgallery
$ sudo apt update;
$ sudo apt install fgallery;
Install optional tools
$ sudo apt install jpegoptim pngcrush 7zip;
Build a gallery from photos in photo-src/.
#!/bin/bash
din=./photo-src ;
dout=./test4 ;
title="Tanabata at Portland Japanese Gardens, July 2022 - Steven Baltakatei Sandoval";
fgallery -c txt -s -j4 --index="https://reboil.com/mediawiki/2022-07-07" \
"$din" "$dout" \
"$title" && \
python3 -m http.server -d "$dout";
The input directory path is set in din, the output directory path is
set in dout.
The title of the webUI page is set in title.
The -c txt option tells fgallery to look for captions for an image
in a text file named exactly the same as the image file except that
the final extension (e.g. .jpg of photo-src/foo.jpg) is replaced
with .txt (e.g. photo-src/foo.txt).
The -s option prevents the original image files from being included
as an archive in the output and disables downloading of such an
archive.
The -j4 option specifies that up to 4 image processing jobs may be
performed in parallel in order to speed up creation of the output
directory contents if at least 4 CPUs are available.
The --index option specifies the URL for the "Back" button at the
top left of the gallery webUI.
The && specifies the next command python3 should only run if the
fgallery command does not fail.
The \ at the end of some lines tells bash to ignore the newline
(so longer commands in the script are broken up to be more readable).
The python3 -m http.server -d "$dout" command creates a simple local
HTTP server at the address http://localhost:8000 so a web browser
can view the webUI without there being a need to upload anything to
any remote server.
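The caption naming rule used by -c txt (final extension replaced with .txt) can be expressed with Bash parameter expansion. The sketch below uses two hypothetical helper functions, caption_path and list_missing_captions, and assumes images end in .jpg, .jpeg, or .png; it only reports which images lack a caption file and does not call fgallery itself.

```shell
#!/bin/bash
# Map an image path to its caption path by swapping the final extension.
# Example: photo-src/foo.jpg -> photo-src/foo.txt
caption_path() {
  local image="$1" dir base
  dir="$(dirname -- "$image")"
  base="$(basename -- "$image")"
  printf '%s/%s.txt\n' "$dir" "${base%.*}"
}

# List images in a directory that have no matching caption file.
list_missing_captions() {
  local dir="$1" image
  for image in "$dir"/*.jpg "$dir"/*.jpeg "$dir"/*.png; do
    [ -e "$image" ] || continue   # skip glob patterns that matched nothing
    [ -e "$(caption_path "$image")" ] || printf '%s\n' "$image"
  done
}
```

Running list_missing_captions ./photo-src before building the gallery shows which photos would appear without captions.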
Migration to Lemmy
Created by Steven Baltakatei Sandoval on 2023-06-15T10:05+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-06-15T18:27+00.
On 2023-06-12, after I noticed a surge of moderators setting major subreddits private, I decided to explore federated alternatives to Reddit which I had heard existed. After some searches, I found mentions of Lemmy and Kbin.
I checked through a list of Lemmy instances and created a spreadsheet factoring in admin age, uptime, and domain name length. I used that to generate a short list to let me review the post history of admins. After this review, I decided that the Lemmy instance sopuli.xyz was the most appropriate for me; I was looking for a smaller instance with at least some months of history. Sopuli.xyz seems to have been started by a Finn who set rules against posts about QAnon, Nazis, and other bigoted groups I don't want to associate myself with. So, I applied and, as of 2023-06-14, now have an account there.
I now look forward to contributing comments to a more decentralized forum.
Usada Pekora BGM animation synchronization
Created by Steven Baltakatei Sandoval on 2023-06-09T21:35+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-06-09T23:14+00.
Summary
I got annoyed at the lack of a properly synchronized Usada Pekora walking animation on YouTube, so I created one. I wrote a Bash script to assemble frames generated by FFmpeg which I then imported into Kdenlive in order to combine with an audio loop edited in Audacity. It took a fair amount of work so I thought I'd write up what I did in TeXmacs.
Links
- YouTube
- Git repository (Git bundle)
- Write-up (PDF)
- Wiki entry.
Meta
This was project BK-2023-04.
ISO 10628 symbol drawing update
Created by Steven Baltakatei Sandoval on 2023-05-24T07:30+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-05-31T19:55+00.
Edit(2023-05-31):Add reboil.com wiki links.
Summary
On 2023-05-23, I updated a drawing I made in 2020-09 to serve as a palette for chemical engineering symbols I might use when constructing my own P&IDs.
Background
Back in 2020-09, I became interested in creating P&IDs for use
in my personal notes and in the DeVoe's Thermodynamics and Chemistry
transcription project of mine (BK-2021-07). I decided to find an
industry consensus standard set of symbols so that my drawings (which
I planned to license CC BY-SA 4.0 for use in Wikipedia and Wikimedia
Commons) could be used by other people. Therefore, I purchased a copy
of ISO 10628-2:2012 and manually drew each symbol in Inkscape.
On 2020-09-25, I uploaded a set of drawings showing all the ISO
10628-2:2012 symbols to Wikimedia Commons (See BK-2020-04-PID-1-SHT1
SVG file). I split the upload into several separate SVG files, due to
lack of multi-page support for SVGs in Inkscape. I also uploaded a set
of PDF files exported from Inkscape since the ISO 3098 font I chose
for the drawing (osifont in the Debian repository) wasn't supported by
Wikimedia Commons.
In 2023-02, I received an email from a John Kunicek about the symbol
numbering system used in the ISO 10628 symbols I drew in
BK-2020-04-PID-1. They also informed me of some spelling errors. On
2023-02-27, I reviewed Kunicek's questions and came to the conclusion
that the symbol numbers in ISO 10628 follow a pattern established in
another ISO standard called ISO 14617, of which ISO 14617-1:2005 is
an index of registration numbers for symbols used in other ISO
standards such as ISO 10628. On 2023-02-28 I sent my reply to Kunicek
and updated the BK-2020-04-PID-1 legends to address the ambiguities
of the original legend, which was limited to the information provided
in my copy of ISO 10628:2012.
On 2023-03-01, while I had the files on my mind, I also decided to
edit the SVG files of BK-2020-04-PID-1 so that each symbol and its
associated text objects containing ISO 10628 descriptions and
registration numbers were grouped together; this update would enable
someone to easily copy and paste individual symbols if they edited the
drawing in Inkscape or, if a machine were performing a text search of
the body of the SVG file itself, it could quickly find the
registration numbers to help identify the associated symbols
nearby in the XML tree of the SVG file. Previously, the symbol objects
and text objects were mixed together in a hodgepodge; symbols and
their associated text objects were only obviously related to a human
looking at the rendered image. I didn't push the updated SVG files
since I wanted to wait and see if any more corrections or questions
came from Kunicek or others.
On 2021-05-08, Wikipedia editor Lonaowna added thumbnails of the SVGs I uploaded to the ISO 10628 Wikipedia article. I myself hadn't wanted to do so since I felt it would have been a conflict of interest and rather self-promotional to push edits containing links of my self-published works.
On 2023-05-23, I decided to go ahead and upload to Wikimedia Commons
the updates of the BK-2020-04-PID-1 SVG files containing the
corrections and clarifications I applied in 2023-02/2023-03. I also
made some adjustments to text placement since, annoyingly, Wikimedia
Commons doesn't have an ISO 3098 compliant font for use in technical
drawing SVG files; PNG previews of the uploaded SVGs showed text
converted into a generic sans serif font that is 25% wider than that
used by osifont. After some adjustments and reuploads, I ended up
with satisfying SVG and PDF versions uploaded to Wikimedia
Commons. Again, here is a link to the first sheet of
BK-2020-04-PID-1; the description contains links to the other PDFs
and SVGs for all the sheets.
Motivation
A significant amount of legwork is involved in creating a reference
drawing when none meeting your criteria exists already. My criterion is
that works I publish should be compatible with a Creative Commons
BY-SA 4.0 license; creating BK-2020-04-PID-1, a palette of ISO 10628
symbols, was part of that process for me. I hope that the work I put
in helps save other people time which they can dedicate to leisure. I
believe leisurely people are the most capable of creative works. I
would prefer to live in a world where projects such as Wikipedia and
Wikimedia Commons can expand in scope to cover specialized
disciplines, saving people time from reinventing wheels.
Even if my uploaded work is hoovered up by an AI and spit out mixed with others, my carefully crafted drawings will remain causally upstream of the process; I think AI language models such as ChatGPT will lubricate on-demand knowledge downloads for the public; the quality of those downloads is dependent upon the quality of the data set the language models are trained on. I'm okay with this process and hope to find other like-minded people who are willing to make a living making such contributions to common knowledge without the politics of worrying about using copyright to protect trade secrets.
Diné Bizaad Bínáhooʼaah Notes
Created by Steven Baltakatei Sandoval on 2023-02-01T09:31+00 under a CC BY-SA 4.0 license and last updated on 2023-05-31T19:10+00.
Edit(2023-05-31):Add reboil.com wiki links.
Background
In 2023-01, I decided to purchase a copy of "Diné Bizaad Bínáhooʼaah = Rediscovering the Navajo Language" to aid me in my studies of the Navajo language. I had tried out the Navajo lessons of Duolingo and found them problematic when it came to anything more complex than memorizing vocabulary (especially regarding verb conjugations).
So, as I read through it, I will record notes on this web page that I think other readers may find useful.
Stats
- Title: Diné Bizaad Bínáhooʼaah = Rediscovering the Navajo language : an introduction to the Navajo language
- Authors:
- Evangeline Parsons Yazzie
- Margaret Speas
- Editors:
- Jessie Ruffenach
- Berlyn Yazzie (Navajo)
- ISBN: 978-1-893354-73-9
- OCLC: 156845819
- Edition: 1st
- Printing: 3rd
- Publisher: Salina Bookshelf, Inc.
- Location: Flagstaff, Arizona
By page
Page xvii
The following hyperlink:
http://www.swarthmore.edu/SocSci/ifernal1/nla/halearch/halearch.htm
is not valid as of 2023-02-01. Searching pages under the
swarthmore.edu domain yields this page which likely contains the
material referenced (i.e. "If you are not sure how this can be done
for Navajo, we suggest that you consult the materials on Situational
Navajo, by Wayne Holm, Irene Silentman and Laura Wallace, available
for download…"):
https://fernald.domains.swarthmore.edu/nla/halearch/halearch.htm
This page and one level of outlinks has been saved via the Internet Archive here.
Page 3
The consonant ʼ
The glyph used in the text to encode the consonant named "glottal stop" appears to be the glyph that is MODIFIER LETTER APOSTROPHE (U+02BC) or RIGHT SINGLE QUOTATION MARK (U+2019) in Unicode.
However, due to widespread input method limitations, the ASCII character APOSTROPHE (U+0027) is often used instead.
The text addresses this:
You probably wonder why an apostrophe has been added to the list above. The letter that looks like an apostrophe is called a glottal stop. A glottal stop is a consonant. We will talk about the glottal stop in the section below on consonants.
In Navajo, the glottal stop is a consonant in the same class as k or x, which each have their own dedicated glyphs. A rational typesetter would not use MULTIPLICATION SIGN (U+00D7) (×) instead of LATIN SMALL LETTER X (U+0078) (x) even though both use similar glyphs. So, the question arises of whether to use MODIFIER LETTER APOSTROPHE (U+02BC) or RIGHT SINGLE QUOTATION MARK (U+2019).
Regarding the difference, the Unicode Standard 15.0 (PDF) has this to say in its General Punctuation section of Writing Systems and Punctuation:
Apostrophes
U+0027 apostrophe is the most commonly used character for apostrophe. For historical reasons, U+0027 is a particularly overloaded character. In ASCII, it is used to represent a punctuation mark (such as right single quotation mark, left single quotation mark, apostrophe punctuation, vertical line, or prime) or a modifier letter (such as apostrophe modifier or acute accent). Punctuation marks generally break words; modifier letters generally are considered part of a word.
When text is set, U+2019 right single quotation mark is preferred as apostrophe, but only U+0027 is present on most keyboards. Software commonly offers a facility for automatically converting the U+0027 apostrophe to a contextually selected curly quotation glyph. In these systems, a U+0027 in the data stream is always represented as a straight vertical line and can never represent a curly apostrophe or a right quotation mark.
Letter Apostrophe. U+02BC modifier letter apostrophe is preferred where the apostrophe is to represent a modifier letter (for example, in transliterations to indicate a glottal stop). In the latter case, it is also referred to as a letter apostrophe.
Punctuation Apostrophe. U+2019 right single quotation mark is preferred where the character is to represent a punctuation mark, as for contractions: “We’ve been here before.” In this latter case, U+2019 is also referred to as a punctuation apostrophe.
An implementation cannot assume that users’ text always adheres to the distinction between these characters. The text may come from different sources, including mapping from other character sets that do not make this distinction between the letter apostrophe and the punctuation apostrophe/right single quotation mark. In that case, all of them will generally be represented by U+2019.
The semantics of U+2019 are therefore context dependent. For example, if surrounded by letters or digits on both sides, it behaves as an in-text punctuation character and does not separate words or lines.
So, according to the Unicode Standard, the appropriate Unicode character to use for glottal stops in the Navajo language is MODIFIER LETTER APOSTROPHE (U+02BC) (ʼ).
Before 2023-02-02, I had recommended use of RIGHT SINGLE QUOTATION MARK (U+2019) (’) primarily as a means to get away from using the "overloaded" character APOSTROPHE (U+0027) where reasonable. However, going forward, I'm now recommending U+02BC instead.
Input methods designed for the Navajo language should dedicate an entire key to MODIFIER LETTER APOSTROPHE (U+02BC) (ʼ) as they would for the ASCII letter LATIN SMALL LETTER K (U+006B) (k).
In summary, ʼ is the glottal stop consonant, not '.
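For text that is known to be Navajo only, existing apostrophe characters can be normalized to U+02BC with sed. This is a minimal sketch assuming UTF-8 encoded input; the function name normalize_glottal_stop is hypothetical. It replaces every U+0027 and U+2019 without regard to context, so it would also alter genuine punctuation apostrophes in mixed-language text.

```shell
#!/bin/bash
# Replace APOSTROPHE (U+0027) and RIGHT SINGLE QUOTATION MARK (U+2019)
# with MODIFIER LETTER APOSTROPHE (U+02BC) in UTF-8 text read from stdin.
# ASSUMPTION: every apostrophe in the input represents a glottal stop.
normalize_glottal_stop() {
  local letter_apostrophe right_quote
  letter_apostrophe="$(printf '\312\274')"   # U+02BC as UTF-8 bytes CA BC
  right_quote="$(printf '\342\200\231')"     # U+2019 as UTF-8 bytes E2 80 99
  sed -e "s/'/${letter_apostrophe}/g" \
      -e "s/${right_quote}/${letter_apostrophe}/g"
}
```

The octal escapes keep the script itself pure ASCII, so the two look-alike characters cannot be confused when the script is edited.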
See Also
Wikipedia articles exist for the authors:
Libro.fm Recommendation
Created by Steven Baltakatei Sandoval on 2023-01-04T15:26+00 under a CC BY-SA 4.0 license and last updated on 2023-05-31T19:09+00.
Edit(2023-05-31):Add reboil.com wiki links.
Background
I enjoy listening to audiobooks. I first began listening to them regularly in 2010 upon my return to Stanford University after serving a 2-year mission for the LDS Church in Panamá. The iPhone had come out while I had been out of the country (I still remember seeing my first iPhone at an electronics shop in David, Chiriquí; I was amazed by how responsive the operating system was to touch screen input when resizing photographs via the novel "pinch and zoom" mechanic.); I didn't purchase an iPhone immediately (I think I would use a flip phone until purchasing an iPhone from AT&T after I left college), but I did purchase with my allowance from my father an iPod Touch which was basically an iPhone without a SIM card slot. I bring up the iPod Touch story because I believe its portability and Audible–iTunes integration allowed me to listen to audiobooks while away from my desktop computer. Audible was the first company I purchased audiobooks from; I would continue using it until 2022.
Before I left Audible
I listened to Audible's audiobooks for about 12 years (2010/2022); these were encrypted by Digital Rights Management (DRM) schemes that inhibited copying. I had not yet learned the importance of using Free/Libre Open Source Software (FLOSS) formats (I wouldn't stop regularly using Windows until I purchased my first dedicated Debian GNU/Linux workstation from Think Penguin in 2018). Therefore, I spent thousands of USD over time buying audiobooks. Audible's feature of allowing me to download and listen to audiobooks indefinitely (from their servers and using only their closed-source apps) kept me satisfied. Even today, in 2023, I'm fairly certain I could install the Audible app from the Google Play store and download every audiobook I have "purchased" from them.
I believe my first misgivings about using Audible were when I realized in transitioning to using FLOSS that I couldn't listen to my audiobooks. In 2018 I would have had to use my Android smartphone or my Windows machine since Audible published their software for use on those platforms. There is no official Audible player in the Debian repository. I can't open an encrypted Audible file in ffmpeg on my Debian machine to compress it; I'd have to use a janky daisy chain of audio inputs and output devices to even be able to do automatic speech transcription in case I wanted to search what I was listening to at a later time. Still, that wasn't enough activation energy to get me to leave Audible until 2022.
The thing that triggered my departure was something mundane: a billing misunderstanding. At some point I had failed to realize that my Audible credits did not roll over from year-to-year. In the beginning, I didn't realize that such a policy existed since I generally used up my platinum subscription credits immediately, especially when I drove a long commute during 2011/2018 in the mostly featureless landscape of southern Utah. After I resigned from my commuting job and I wasn't forcing myself to drive two hours a day (nearly 10% of my life) anymore, I found myself listening to Audible audiobooks less. By the time 2022 rolled around, I hadn't checked my Audible app in months. It was in late 2022-03 that I realized that my credits regularly had been expiring instead of accumulating. A background daydream that I would one day buy a long audiobook series on Audible all at once was dispelled. I decided to leave.
The interim
I decided that if I were to return, it would be if I could guarantee my audiobooks were DRM-free. For some months in 2022 I subsisted on podcasts such as Opening Arguments (for law explanation by a lawyer), Citation Needed (for comedic takes on various Wikipedia articles) and Security Now (to be aware of IT-specific news). In the past I knew that it was possible to download audiobooks directly from authors if authors took the effort to do so; for example, in late 2020, I purchased DRM-free copies of Cory Doctorow's books Radicalized (2019; WorldCat) and Attack Surface (2020; WorldCat), paying him via PayPal and receiving a download link to DRM-free zip files containing unencrypted audio files. A friend recommended I use AudioAnchor, an F-Droid app designed to facilitate audiobook listening on an Android phone; it worked great. However, Cory Doctorow is only a single author; I wanted a DRM-free audiobook vendor.
Libro.fm: My new audiobook source
In late 2022 I discovered Libro.fm via a blog post by Cory Doctorow on boingboing.net talking about how Google launched a DRM-free audiobook store. In background that he provided, I latched onto some DRM-free audiobook store recommendations that he made, including Downpour and Libro.fm. I poked around both Downpour and Libro.fm and found that I liked Libro.fm best. I bought How To, by Randall Munroe, and Klara and the Sun, by Kazuo Ishiguro.
Since then, I've purchased various titles including:
- What We Owe the Future by William MacAskill
- The Silver Ships series (minus the first book since that's an Audible exclusive, but it's pulp sci-fi so, no book is really that critical to the entertainment)
- American Crusade by Andrew L. Seidel
- What If? 2 by Randall Munroe
- Educated by Tara Westover (from Obama's 2019 summer reading list)
- Seveneves (a book I already purchased on Audible back in 2015 but I really wanted a copy I could preserve)
- Artemisa por Andy Weir (Spanish version of Artemis)
- El marciano por Andy Weir (Spanish version of The Martian; Andy Weir's works in English seem to be Audible exclusives, so those two years walking around Panama didn't go completely to waste =P)
- Proyecto Hail Mary (Spanish version of Project Hail Mary)
- NPCs by Drew Hayes (some Dungeons and Dragons-themed comedy)
I noticed that Libro.fm lacks the selection of Audible. For example, it doesn't carry my favorite Terry Pratchett novel Small Gods (1992) but it does carry recent titles of his such as Snuff (2011) and The Shepherd's Crown (2015).
Aside: DRM piracy
I imagine the main reason why Audible chooses to restrict access to their audiobooks via DRM is piracy. Some people, when they get their hands on an unencrypted digital file, share it with others. Digital copies can be manufactured at basically zero cost but commercial publishers like Audible grew rich on profit margins on production and distribution costs; books had mass which incurred costs upon which a percentage fee could be applied at the final sale; when the distribution cost fell to zero, instead of becoming like Apple and the music industry in 2006 and simply selling songs at 0.99 USD each, they chose to require customers to run secret software that would decrypt books at the point of consumption. That isn't to say that none of the music Apple sold was locked by DRM; much of it was. But the point of my retelling this history is to point out that DRM is not required to make money.
Services such as Libro.fm sell audiobooks without DRM. No special software is required to play the audio. It's true that I could upload these files to some server and share them with my friends. However, what I think keeps most people from doing so are issues of trust and effort. Downloading and double-clicking on files you download from the internet is a fast way for the average user to corrupt their computer with malware. A sort of natural selection process of behaviors is at work. Behaviors that result in broken computers due to downloading and running files from unknown sources are seen as destructive and the sites involved avoided. Behaviors that result in non-broken computers and a simple high quality experience are seen as good. Some people dedicate time to master the esoteric computer science techniques of verifying cryptographic digests, preserving their anonymity via onion routing, maintaining a firewall around their home networks, and regularly updating their software with the latest security updates; these people can be effective pirates. However, with all those skills they can also become effective software developers and make money that they can spend at places like Libro.fm or Downpour.com to save themselves the trouble of having to bypass DRM restrictions in the first place. The real valuable service DRM-free audiobook vendors can provide is two parts:
- Files are guaranteed to be available for fast download.
- Files are guaranteed not to be malicious.
When pirating media to avoid DRM, a user might expect to spend anywhere from an hour to weeks identifying and downloading media that might be a trojan horse. With DRM-free vendors, a user can expect to spend a few minutes, with a commercial guarantee of the product's authenticity. When you use Audible, you form an ongoing contract that Audible can end at any time, resulting in your "purchases" becoming unusable noise. When you use Libro.fm, Libro.fm can't retroactively make the files they sold you unusable; without DRM, there is no mechanism for controlling user behavior. A principle of Free/Libre Open Source Software is the avoidance of such methods of control in order to grant the user freedom.
Conclusion
Although lacking in selection, Libro.fm surpasses Audible in that the money I spend with them results in audiobooks that I can preserve forever, without worrying about finding an app to verify that I have a license to download some decryption key. This is why I'm redirecting my cash flow towards DRM-free vendors.
Copyright
"A tower of used books - 8443" by Jorge Royan is licensed under CC BY-SA 3.0.
Inactive on Twitter
Created by Steven Baltakatei Sandoval on 2022-11-10T20:14+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-06-03T12:23+00.
UPDATE (2023-02-01): I think I finally managed to delete all my tweets and likes from Twitter via TweetDelete. Some previous attempts didn't quite clear everything from 2017 and earlier. I've been enjoying using my twit.social/@baltakatei account with the Tusky app via F-Droid.
UPDATE (2022-11-23): My new microblogging feed is at twit.social/@baltakatei, one of many Mastodon servers. My last Twitter post is an announcement of this migration. I chose twit.social since it is operated by Leo Laporte, the host of several podcasts and television shows I have listened to in the past and found trustworthy as far as communicating technology news. I still listen regularly to his and Steve Gibson's Security Now podcast.
Edit (2023-05-31/2023-06-03): Add reboil.com wiki links.
I decided not to be active on the microblogging site Twitter after Elon Musk completed his purchase of the publicly traded social media company and promptly fired the CEO and dissolved the board of directors, making himself the only director. I had developed some trust in its original CEO, Jack Dorsey, back when Twitter had been the subject of discussion on Leo Laporte's This Week in Tech podcast in the late 00s. In the 2010s I decided that I would be okay publishing text on Twitter because from the get-go the site explained that what was submitted would be public. In contrast, Facebook (which I deactivated back in the early 2010s, long before Zuckerberg renamed it "Meta") advertised privacy settings that would allow posts to be shared with only a limited number of contacts (and with Facebook employees); however, the privacy settings were complex and there didn't seem to be a default setting that would stick over time. So, Twitter's transparently public nature seemed more honest. My posts would be available and there was no sign that the administrators of the site favored any particular political party; the most common reason I saw for Tweets being removed was threats of violence or harassment. Prior to 2022, posts to Twitter could be relied upon to remain unfiltered, provided you weren't threatening violence or spreading misinformation.
That changed in 2022 when I saw Elon Musk purchase the company, making the service his own privately owned property. Now, were I to continue to post to Twitter, I would be making a public donation to Musk that he could choose to throw away like he did the company leaders that he fired. That in itself may not have been a dealbreaker for me, but he also proceeded to endorse the Republican Party, which continues to rely upon the criminal President who organized the attempted coup of the United States of 2021-01-06. His tweet removed any doubt that he would turn Twitter into a tool to promote the Republican Party. His exclusive ownership of Twitter gives him privileged mechanisms to promote his own political opinions while silencing others, including:
- Removing user-submitted content that criticizes him (as he has banned users for adopting his name and image in protest).
- Removing features from his critics (as Congresswoman Alexandria Ocasio-Cortez reported).
I admit that many people are firmly rooted in the habit of using Twitter as their default social media space to remain connected to each other. Choosing to leave Twitter for another space risks losing contact with people who have not yet left. Habitual use of Twitter is like a gravity well that requires a significant activation energy of its inhabitants to escape. However, I stand by my decision for reasons similar to those that compelled me to leave Facebook: I can no longer assume what I post will be secure from censorship.
So, what is my social media space? Without Twitter, Reddit is my default. I'd like to make use of this blog more often, although I will need to figure out a more convenient way to post content. Currently, my process is:
- Author posts in Emacs Org mode.
- Export posts into Markdown text.
- Commit the Markdown text to a git repo.
- Push the commit to my reboil.com server.
- Wait for an update script to run or log into the server to run it manually.
I could probably automate all that into a single Emacs function or Bash script, given enough time, in order to mimic the simplicity of microblogging. However, for now, these longer-form posts satisfy me.
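As a rough illustration, the five manual steps listed above could be sketched as a small shell script. This is only a sketch under assumptions, not my actual setup: the variable names `GIT_REMOTE`, `BLOG_HOST`, and `UPDATE_CMD` and their default values (in particular the `update-blog` command name) are hypothetical placeholders, and the script assumes it is run from inside the blog's git repository. The export step uses Org mode's `org-md-export-to-markdown` function in Emacs batch mode.

```shell
#!/bin/bash
# Sketch of the manual publishing steps as one function.
# Hypothetical names (assumptions, not from the post): GIT_REMOTE,
# BLOG_HOST, UPDATE_CMD and their defaults. Adjust to the real setup.
GIT_REMOTE="${GIT_REMOTE:-origin}"
BLOG_HOST="${BLOG_HOST:-reboil.com}"
UPDATE_CMD="${UPDATE_CMD:-update-blog}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Export one Org file to Markdown, commit it, push it, and trigger the
# server-side update. Stops at the first step that fails.
# Assumes the current directory is inside the blog's git repository.
publish() {
    local org md
    org="$1"
    md="${org%.org}.md"   # the Markdown export lands next to the Org file
    run emacs --batch "$org" --eval "(require 'ox-md)" \
        --funcall org-md-export-to-markdown &&
    run git add "$md" &&
    run git commit -m "Publish $md" &&
    run git push "$GIT_REMOTE" HEAD &&
    run ssh "$BLOG_HOST" "$UPDATE_CMD"
}
```

Setting `DRY_RUN=1` before calling `publish path/to/post.org` prints the five commands instead of running them, which is a cheap way to check the plan before touching the repository.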
This blog is powered by ikiwiki.
Copyright © 2020-2023 Steven Baltakatei Sandoval (PGP: 0xA0A295ABDC3469C9). Text is available under the Creative Commons Attribution-ShareAlike license (🅭🅯🄎4.0); additional terms may apply.