FreedomBox Mediawiki Fail2ban filter

Created by Steven Baltakatei Sandoval on 2023-12-28T15:37-08 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-12-31T21:36-08.

Summary

I noticed a significant increase in CPU usage on a public FreedomBox webserver I run for publishing my notes via a MediaWiki instance at reboil.com/mediawiki. The high usage was caused by frequent expensive requests for dynamically generated special pages. I implemented two solutions: modifying the server's robots.txt and creating a fail2ban filter.

Table of Contents

  1. FreedomBox Mediawiki Fail2ban filter
    1. Summary
    2. Background
    3. Analysis
      1. journalctl
      2. top
      3. dstat
      4. dool
    4. Methodology
      1. robots.txt
      2. fail2ban filter
      3. Relevant system information

Background

Note that this procedure may work only on FreedomBox instances, since FreedomBox itself makes automatic configuration file changes. For example, fail2ban should already be installed and maintained by FreedomBox.

FreedomBox is a Debian package that converts the machine it is installed on into a personal cloud server. Specifically, it converts the machine into an Apache server with a webUI for installing apps such as WordPress (blog), MediaWiki (wiki), Bepasty (file sharing), Ejabberd (XMPP chat server), Postfix/Dovecot (email server), OpenVPN, and Radicale (calendar and addressbook), among others. See the Manual for details.

All files are created or edited as root. A FreedomBox account with administrator privileges can obtain a root shell by logging in via ssh and running:

$ sudo su -
#

This article assumes a basic knowledge of GNU/Linux such as logging into your FreedomBox via ssh, editing text files via the command line, viewing file contents with cat, running Bash scripts, and being aware of file ownership issues.

Analysis

I detected the high traffic via a journalctl command resembling the following:

journalctl

# journalctl --output=short-iso --follow

A more focused command is described in the following Bash script run as the root user.

#!/bin/bash
journalctl --output=short-iso --follow | \
  grep --line-buffered "apache-access" | \
  less -S +F

Below are portions of example lines of expensive index.php? requests.

/mediawiki/index.php?returnto=1770-03-07&returntoquery=redirect%3Dno&title=Speci
/mediawiki/index.php?target=1770-03-22&title=Special%3AWhatLinksHere HTTP/1.1" 4
/mediawiki/index.php?action=history&title=1784-03-15 HTTP/1.1" 200 5025 "-" "Moz
/mediawiki/index.php?returnto=1770-03-07&returntoquery=redirect%3Dno&title=Speci
/mediawiki/index.php?action=history&title=1784-03-15 HTTP/1.1" 200 6494 "-" "Moz
/mediawiki/index.php?action=edit&title=1770-03-08 HTTP/1.1" 200 5031 "-" "Mozill
/mediawiki/index.php?target=1862-04-02&title=Special%3AWhatLinksHere HTTP/1.1" 4
/mediawiki/index.php?action=edit&title=1770-03-08 HTTP/1.1" 200 4911 "-" "Mozill
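
To see which clients are responsible for requests like these, the matching lines can be tallied per client address. The following is a minimal sketch, not part of my original procedure. It assumes each input line begins with the client address (as in a plain Apache access log entry); the function name count_index_requests is hypothetical.

```shell
#!/bin/bash
# Tally GET requests for /mediawiki/index.php? per client address.
# Assumption: the first whitespace-separated field of each input line
# is the client address, as in Apache's common/combined log formats.
count_index_requests() {
  awk '$0 ~ /"GET \/mediawiki\/index\.php\?/ { hits[$1]++ }
       END { for (ip in hits) print hits[ip], ip }' | sort -rn
}
```

Log lines are fed in on standard input, and each output line holds a request count followed by an address, busiest client first. Journal output that carries a timestamp or hostname prefix would need that prefix stripped first.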

top

High CPU usage was indicated by the appearance of multiple php-fpm7.4 processes in the output of top, a task manager available on most Unix-like operating systems such as Debian 12.

dstat

dstat is a system performance monitoring utility that I was fond of. FreedomBox uses a Red Hat rewrite that took over the dstat name and lacks the handy --top-cpu option. Even so, the following command should still show relevant CPU information, printing one line of averaged data every 60 seconds:

# dstat --time --load --proc --cpu --mem --disk --io --net --sys --vm 60

dool

dool is a Python 3 compatible fork of dstat that isn't tracked by Debian but which recreates the dstat behavior I'm used to, including the --top-cpu option. You can install it via:

# git clone https://github.com/scottchiefbaker/dool.git dool
# cd dool
# ./install.py 
You are root, doing a local install

Installing binaries to /usr/bin/
Installing plugins  to /usr/share/dool/
Installing manpages to /usr/share/man/man1/

Install complete. Dool installed to /usr/bin/dool

You can then run the command via:

# dool --time --load --proc --cpu --top-cpu --mem --disk --io --net --sys --vm 60

Installing for use by the

Methodology

robots.txt

According to the MediaWiki Manual for robots.txt, requests from webcrawlers to index.php may be disallowed by adding the following text to a server's robots.txt file. In my particular FreedomBox instance, the file is located at /var/www/html/robots.txt. I use the cat command merely to show the contents and location of the file for this explanation.

# cat /var/www/html/robots.txt
User-agent: *
Disallow: /mediawiki/index.php?

No restart of the apache2 service should be necessary.
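
As a quick sanity check, the presence of the rule can be confirmed from a script. This is only a sketch; the function name has_index_disallow is hypothetical, and the match is deliberately strict (the whole line must equal the rule, so trailing whitespace would cause a miss).

```shell
#!/bin/bash
# Succeed (exit status 0) if the given robots.txt file contains the
# exact line "Disallow: /mediawiki/index.php?".
# -F: fixed string, -x: whole-line match, -q: no output.
has_index_disallow() {
  grep -Fxq 'Disallow: /mediawiki/index.php?' "$1"
}
```

For example, `has_index_disallow /var/www/html/robots.txt && echo "rule present"` prints the message only when the rule is found.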

index.php is the main access point for a MediaWiki site. In my FreedomBox installation, MediaWiki pages are served by default at URLs omitting index.php. For example, my article on the Moon is served by default at https://reboil.com/mediawiki/Moon. Notably, https://reboil.com/mediawiki/index.php?title=Moon also works, but it will trigger the fail2ban filter described below.

This modification of robots.txt alone is an indirect way to reduce web crawler requests for dynamically generated MediaWiki pages but it relies on coöperation from webcrawlers themselves to honor the Disallow request. The following fail2ban filter is an active response that bans IP addresses that make repeated requests.

fail2ban filter

Fail2ban is a program used by default with a FreedomBox instance. A generic tutorial for configuring it is available here.

For my instance, I set up the filter by creating two files and running a systemd command to restart the fail2ban service.

The first file to create sets up a filter for fail2ban that uses a regular expression to identify requests for index.php. It is a 3-line file named mediawiki.conf saved in /etc/fail2ban/filter.d/.

# cat /etc/fail2ban/filter.d/mediawiki.conf 
[Definition]
failregex = <HOST> -.*"GET /mediawiki/index\.php\?.*"
ignoreregex =
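
To get a feel for what this failregex catches, it can be approximated with grep. This is a rough sketch only: fail2ban expands the <HOST> tag itself, so here it is swapped for a loose address pattern, and the function name matches_filter is hypothetical. fail2ban also ships its own fail2ban-regex tool, which is the more faithful way to test a filter.

```shell
#!/bin/bash
# Pass through only the lines that the mediawiki filter would likely match.
# Assumption: <HOST> is approximated by a run of hex digits, dots, and
# colons (loosely covering IPv4 and IPv6 addresses).
matches_filter() {
  grep -E '[0-9A-Fa-f:.]+ -.*"GET /mediawiki/index\.php\?.*"'
}
```

A request for /mediawiki/index.php?title=Moon passes through this function, while a request for /mediawiki/Moon does not, which mirrors the intent of the filter.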

The second file is a jail.local which references the mediawiki.conf file (via the [mediawiki] line) and specifies the trigger and reset conditions for an IP address ban.

# cat /etc/fail2ban/jail.local
[mediawiki]
enabled = true
port = http,https
filter = mediawiki
logpath  = %(apache_error_log)s
maxretry = 60
findtime = 600
bantime = 3600

The logpath line is specific to how FreedomBox configures its logs when applying fail2ban to other applications. maxretry is the number of requests within a window of findtime seconds that will trigger a ban lasting bantime seconds.

In this particular example, a webcrawler requesting 60 or more pages via my MediaWiki's index.php access point within a 10-minute window (findtime = 600 seconds) will get a 1-hour ban on its IP address. Most dynamically generated pages a typical human user would request are a page's history or its edit form. I find it implausible for a human to make sixty such requests in ten minutes (that's one request per ten seconds), so I find these limits rational.
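
The arithmetic behind these limits can be checked with shell integer math. The sketch below uses the jail.local values shown above (maxretry = 60, findtime = 600, bantime = 3600); the function names are hypothetical helpers, not fail2ban features.

```shell
#!/bin/bash
# Average spacing, in seconds, between requests that just reaches the limit.
seconds_per_request() {
  local maxretry=$1 findtime=$2
  echo $(( findtime / maxretry ))
}

# Convert a duration in seconds to whole minutes.
to_minutes() {
  echo $(( $1 / 60 ))
}

echo "window:  $(to_minutes 600) minutes"                      # findtime
echo "spacing: $(seconds_per_request 60 600) s per request"    # findtime / maxretry
echo "ban:     $(to_minutes 3600) minutes"                     # bantime
```

Raising maxretry or shortening findtime makes the jail more tolerant of bursts; the opposite makes it stricter.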

In my instance, I had to create the jail.local file. According to the Linode tutorial, jail.local is meant to permit a local administrator to extend and override configurations established in default .conf files such as fail2ban.conf.

To immediately apply the new fail2ban filter, the service must be restarted:

# systemctl restart fail2ban.service

Current bans can be viewed via a fail2ban-client command:

# fail2ban-client status mediawiki
Status for the jail: mediawiki
|- Filter
|  |- Currently failed: 1
|  |- Total failed: 5920
|  `- Journal matches:  
`- Actions
   |- Currently banned: 1
   |- Total banned: 51
   `- Banned IP list:   47.76.35.19
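
If the banned addresses are needed one per line (for example, to look them up elsewhere), they can be pulled out of this status output. A minimal sketch, assuming the "Banned IP list:" line keeps the format shown above; the function name banned_ips is hypothetical.

```shell
#!/bin/bash
# Read `fail2ban-client status JAIL` output on stdin and print each
# banned address on its own line.
banned_ips() {
  sed -n 's/.*Banned IP list:[[:space:]]*//p' \
    | tr -s ' ' '\n' \
    | sed '/^$/d'
}
```

Usage would look like `fail2ban-client status mediawiki | banned_ips`. An address banned by mistake can be released with `fail2ban-client set mediawiki unbanip ADDRESS`.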

Relevant system information

  • FreedomBox version: 23.6.2
  • Operating system: Debian GNU/Linux 11 (bullseye)
Posted 2023-12-29T02:47:03+0000

San Juan College 45th Annual Luminaria photos

Created by Steven Baltakatei Sandoval on 2023-12-04T04:51+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-12-04T05:37+00.

Summary

On 2023-12-02, I attended the 45th Annual Luminarias display held by San Juan College (SJC). I took some photos which I have uploaded to a static image gallery using fgallery.

Background

Growing up in Farmington, NM, I had considered the holiday luminaria display at SJC to be a routine celebration observed everywhere. It wasnʼt until after I left the area that I realized lighting thousands of candles inside paper bags was a somewhat niche tradition. The Wikimedia Commons category Luminaries in the United States shows some photographs but none of the San Juan College event which has been going on for over 4 decades. At some point I'll upload some photos to Commons but I wanted to get some select photos uploaded to my static resource site https://reboil.com/res/.

Posted 2023-12-04T05:37:54+0000

2023-10-14 Annular Solar Eclipse

Created by Steven Baltakatei Sandoval on 2023-10-15T09:31+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-10-15T10:40+00.

A photo of the 2023-10-14 annular solar eclipse taken by Steven Baltakatei Sandoval from Vancouver, Washington.

(Image © Steven Baltakatei Sandoval, 🅭🅯🄎4.0)

Summary

I took a photograph of the 2023-10-14 solar eclipse from Vancouver, Washington. I uploaded a copy to Wikimedia Commons here.

Background

I got lucky and managed to get just the right amount of cloudcover to take a photo without any special equipment besides my Sony a7 III with a 70-300mm telephoto lens.

I had known about the eclipse for a few days, having not really tracked it in my notes since it was not a total solar eclipse. However, some family members let me know it was happening and I noticed I wouldn't have to travel very far from my Vancouver, Washington, dwelling to see a significant occultation. I saw that it was scheduled to be viewable at my local morning around 2023-10-14T09:00-07. I prepared my eclipse glasses the night before.

So, when I woke up on the day of the eclipse, despite being somewhat disappointed at seeing an overcast sky, I went outside a few minutes before maximum occlusion. I found my roommates were already outside looking to the southeast at a bright spot in the clouds where the sun was. The morning clouds were clearing. I saw that some variation in cloudcover would allow me to see the eclipse without any glasses. I also saw that it was possible for me to use the cloud cover to take a photo with my camera. I quickly retrieved my camera and snapped a few photos when a dark cloud passed by. The result allowed me to clearly see the crescent shape of the partial solar eclipse.

I then downloaded my photos to a computer, cropped one of the better images, then uploaded it to my Mastodon account, Wikimedia Commons, and my website. Some commenters said the image looked like it could be an album cover; I told them anyone was free to use it under the CC BY-SA 4.0 license.

To be honest, there are many higher quality photographs than mine. In particular, I like this set by a Ross A. Whitley, taken from the San Jose Mission in San Antonio, Texas.

A photo of the 2023-10-14 annular solar eclipse taken by Ross A. Whitley from San Antonio, Texas.

(Image © Ross A. Whitley, 🅭🅮1.0)

Posted 2023-10-15T10:32:11+0000

fgallery example

Created by Steven Baltakatei Sandoval on 2023-07-07T01:40+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-07-07T03:41+00.

Example gallery.

Summary

I wanted to test out a static HTML+Javascript photo gallery package I found on Debian to answer a question on a Lemmy thread about self-hosted photo galleries. The end result was this example gallery.

Setup procedure

On a Debian 11 system, the following commands may be used to create an fgallery image gallery.

Install fgallery

$ sudo apt update;
$ sudo apt install fgallery;

Install optional tools

$ sudo apt install jpegoptim pngcrush 7zip;

Build a gallery from photos in photo-src/.

#!/bin/bash
din=./photo-src ;
dout=./test4 ;
title="Tanabata at Portland Japanese Gardens, July 2022 - Steven Baltakatei Sandoval";
fgallery -c txt -s -j4 --index="https://reboil.com/mediawiki/2022-07-07" \
"$din" "$dout" \
"$title" && \
python3 -m http.server -d "$dout";

The input directory path is set in din, the output directory path is set in dout.

The title of the webUI page is set in title.

The -c txt option tells fgallery to look for captions for an image in a text file named exactly the same as the image file except that the final extension (e.g. .jpg of photo-src/foo.jpg) is replaced with .txt (e.g. photo-src/foo.txt).

The -s option prevents the original image files from being included as an archive in the output and disables downloading of such an archive.

The -j4 option specifies that up to 4 image processing jobs may be performed in parallel in order to speed up creation of the output directory contents if at least 4 CPUs are available.

The --index option specifies the URL for the "Back" button at the top left of the gallery webUI.

The && specifies that the next command, python3, should run only if the fgallery command does not fail.

The \ at the end of some lines tells bash to ignore the newline (so longer commands in the script are broken up to be more readable).

The python3 -m http.server -d "$dout" command creates a simple local HTTP server at the address http://localhost:8000 so a web browser can view the webUI without there being a need to upload anything to any remote server.
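
Because the -c txt option described above depends on a matching .txt file for each image, it can help to list images that lack one before building the gallery. This is a minimal sketch of my own, not an fgallery feature; the function name missing_captions is hypothetical, and only lowercase .jpg, .jpeg, and .png extensions are checked.

```shell
#!/bin/bash
# Print every image in the given directory that has no caption file
# named like the image but with the final extension replaced by .txt.
missing_captions() {
  local dir=$1 img
  for img in "$dir"/*.jpg "$dir"/*.jpeg "$dir"/*.png; do
    [ -e "$img" ] || continue                       # skip unmatched glob patterns
    [ -e "${img%.*}.txt" ] || printf '%s\n' "$img"  # no caption file found
  done
}
```

Running `missing_captions ./photo-src` before the fgallery command prints nothing when every image has a caption file.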

Posted 2023-07-07T01:53:37+0000

Migration to Lemmy

Created by Steven Baltakatei Sandoval on 2023-06-15T10:05+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-06-15T18:27+00.

On 2023-06-12, after I noticed a surge of moderators setting major subreddits private, I decided to explore federated alternatives to Reddit which I had heard existed. After some searches, I found mentions of Lemmy and Kbin.

I checked through a list of Lemmy instances and created a spreadsheet factoring in admin age, uptime, and domain name length. I used that to generate a short list to let me review the post history of admins. After this review, I decided that the Lemmy instance sopuli.xyz was the most appropriate for me; I was looking for a smaller instance with at least some months of history. Sopuli.xyz seems to have been started by a Finn who set rules against posts about QAnon, Nazis, and other bigoted groups I don't want to associate myself with. So, I applied and, as of 2023-06-14, now have an account there.

I now look forward to contributing comments to a more decentralized forum.

Posted 2023-06-15T12:03:07+0000

Usada Pekora BGM animation synchronization

Created by Steven Baltakatei Sandoval on 2023-06-09T21:35+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-06-09T23:14+00.

Summary

I got annoyed at the lack of a properly synchronized Usada Pekora walking animation on YouTube, so I created one. I wrote a Bash script to assemble frames generated by FFmpeg which I then imported into Kdenlive in order to combine with an audio loop edited in Audacity. It took a fair amount of work so I thought I'd write up what I did in TeXmacs.

Links

Meta

This was project BK-2023-04.

Posted 2023-06-09T22:25:18+0000

ISO 10628 symbol drawing update

Created by Steven Baltakatei Sandoval on 2023-05-24T07:30+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-05-31T19:55+00.

Edit(2023-05-31):Add reboil.com wiki links.

Summary

On 2023-05-23, I updated a drawing I made in 2020-09 to serve as a palette for chemical engineering symbols I might use when constructing my own P&IDs.

Background

Back in 2020-09, I became interested in creating P&ID diagrams for use in my personal notes and in the DeVoe's Thermodynamics and Chemistry transcription project of mine (BK-2021-07). I decided to find an industry consensus standard set of symbols so that my drawings (which I planned to license CC BY-SA 4.0 for use in Wikipedia and Wikimedia Commons) could be used by other people. Therefore, I purchased a copy of ISO 10628-2:2012 and manually drew each symbol in Inkscape.

On 2020-09-25, I uploaded a set of drawings showing all the ISO 10628-2:2012 symbols to Wikimedia Commons (See BK-2020-04-PID-1-SHT1 SVG file). I split the upload into several separate SVG files, due to lack of multi-page support for SVGs in Inkscape. I also uploaded a set of PDF files exported from Inkscape since the ISO 3098 font I chose for the drawing (osifont in the Debian repository) wasn't supported by Wikimedia Commons.

In 2023-02, I received an email from a John Kunicek about the symbol numbering system used in ISO 10628 symbols I drew in BK-2020-04-PID-1. They also informed me of some spelling errors. On 2023-02-27, I reviewed Kunicek's questions and concluded that the symbol numbers in ISO 10628 follow a pattern established in another ISO standard, ISO 14617, of which ISO 14617-1:2005 is an index of registration numbers for symbols used in other ISO standards such as ISO 10628. On 2023-02-28, I sent my reply to Kunicek and updated the BK-2020-04-PID-1 legends to address the ambiguities of the original legend, which was limited to the information provided in my copy of ISO 10628:2012.

On 2023-03-01, while I had the files on my mind, I also decided to edit the SVG files of BK-2020-04-PID-1 so that each symbol and its associated text objects containing ISO 10628 descriptions and registration numbers were grouped together. This update would enable someone to easily copy and paste individual symbols if they edited the drawing in Inkscape; likewise, a machine performing a text search of the body of the SVG file itself could quickly find the registration numbers and so identify the associated symbols nearby in the XML tree of the SVG file. Previously, the symbol objects and text objects were mixed together in a hodgepodge; symbols and their associated text objects were only obviously related to a human looking at the rendered image. I didn't push the updated SVG files since I wanted to wait and see if any more corrections or questions came from Kunicek or others.

On 2021-05-08, Wikipedia editor Lonaowna added thumbnails of the SVGs I uploaded to the ISO 10628 Wikipedia article. I myself hadn't wanted to do so since I felt it would have been a conflict of interest and rather self-promotional to push edits containing links of my self-published works.

On 2023-05-23, I decided to go ahead and upload to Wikimedia Commons the updates of the BK-2020-04-PID-1 SVG files containing the corrections and clarifications I applied in 2023-02/2023-03. I also made some adjustments to text placement since, annoyingly, Wikimedia Commons doesn't have an ISO 3098 compliant font for use in technical drawing SVG files; PNG previews of the uploaded SVGs showed text converted into a generic sans serif font that is 25% wider than that used by osifont. After some adjustments and reuploads, I ended up with satisfying SVG and PDF versions uploaded to Wikimedia Commons. Again, here is a link to the first sheet of BK-2020-04-PID-1; the description contains links to the other PDFs and SVGs for all the sheets.

Motivation

A significant amount of legwork is involved in creating a reference drawing when none meeting your criteria exists already. My criterion is that works I publish should be compatible with a Creative Commons BY-SA 4.0 license; creating BK-2020-04-PID-1, a palette of ISO 10628 symbols, was part of that process for me. I hope that the work I put in helps save other people time which they can dedicate to leisure. I believe leisurely people are the most capable of creative works. I would prefer to live in a world where projects such as Wikipedia and Wikimedia Commons can expand in scope to cover specialized disciplines, saving people time from reinventing wheels.

Even if my uploaded work is hoovered up by an AI and spit out mixed with others, my carefully crafted drawings will remain causally upstream of the process; I think AI language models such as ChatGPT will lubricate on-demand knowledge downloads for the public; the quality of those downloads is dependent upon the quality of the data set the language models are trained on. I'm okay with this process and hope to find other like-minded people who are willing to make a living making such contributions to common knowledge without the politics of worrying about using copyright to protect trade secrets.

Posted 2023-05-24T10:48:41+0000

Diné Bizaad Bínáhooʼaah Notes

Created by Steven Baltakatei Sandoval on 2023-02-01T09:31+00 under a CC BY-SA 4.0 license and last updated on 2023-05-31T19:10+00.

Edit(2023-05-31):Add reboil.com wiki links.

Background

In 2023-01, I decided to purchase a copy of "Diné Bizaad Bínáhooʼaah = Rediscovering the Navajo Language" to aid me in my studies of the Navajo language. I had tried out the Navajo lessons of Duolingo and found them problematic when it came to anything more complex than memorizing vocabulary (especially regarding verb conjugations).

So, as I read through it, I will record notes on this web page that I think other readers may find useful.

Stats

  • Title: Diné Bizaad Bínáhooʼaah = Rediscovering the Navajo language : an introduction to the Navajo language
  • Authors:
    • Evangeline Parsons Yazzie
    • Margaret Speas
  • Editors:
    • Jessie Ruffenach
    • Berlyn Yazzie (Navajo)
  • ISBN: 978-1-893354-73-9
  • OCLC: 156845819
  • Edition: 1st
  • Printing: 3rd
  • Publisher: Salina Bookshelf, Inc.
  • Location: Flagstaff, Arizona

By page

Page xvii

The following hyperlink:

http://www.swarthmore.edu/SocSci/ifernal1/nla/halearch/halearch.htm

is not valid as of 2023-02-01. Searching pages under the swarthmore.edu domain yields this page which likely contains the material referenced (i.e. "If you are not sure how this can be done for Navajo, we suggest that you consult the materials on Situational Navajo, by Wayne Holm, Irene Silentman and Laura Wallace, available for download…"):

https://fernald.domains.swarthmore.edu/nla/halearch/halearch.htm

This page and one level of outlinks has been saved via the Internet Archive here.

Page 3

  1. The consonant ʼ

    The glyph used in the text to encode the consonant named "glottal stop" appears to be the glyph that is MODIFIER LETTER APOSTROPHE (U+02BC) or RIGHT SINGLE QUOTATION MARK (U+2019) in Unicode.

    However, due to widespread input method limitations, the ASCII character APOSTROPHE (U+0027) is often used instead.

    The text addresses this:

    You probably wonder why an apostrophe has been added to the list above. The letter that looks like an apostrophe is called a glottal stop. A glottal stop is a consonant. We will talk about the glottal stop in the section below on consonants.

    In Navajo, the glottal stop is a consonant in the same class as k or x which each have their own dedicated glyphs. A rational typesetter would not use MULTIPLICATION SIGN (U+00D7) (×) instead of LATIN SMALL LETTER X (U+0078) (x) even though both use similar glyphs.

    So, the question arises of whether to use MODIFIER LETTER APOSTROPHE (U+02BC) or RIGHT SINGLE QUOTATION MARK (U+2019).

    Regarding the difference, the Unicode Standard 15.0 (PDF) has this to say in its General Punctuation section of Writing Systems and Punctuation:

    Apostrophes

    U+0027 apostrophe is the most commonly used character for apostrophe. For historical reasons, U+0027 is a particularly overloaded character. In ASCII, it is used to represent a punctuation mark (such as right single quotation mark, left single quotation mark, apostrophe punctuation, vertical line, or prime) or a modifier letter (such as apostrophe modifier or acute accent). Punctuation marks generally break words; modifier letters generally are considered part of a word.

    When text is set, U+2019 right single quotation mark is preferred as apostrophe, but only U+0027 is present on most keyboards. Software commonly offers a facility for automatically converting the U+0027 apostrophe to a contextually selected curly quotation glyph. In these systems, a U+0027 in the data stream is always represented as a straight vertical line and can never represent a curly apostrophe or a right quotation mark.

    Letter Apostrophe. U+02BC modifier letter apostrophe is preferred where the apostrophe is to represent a modifier letter (for example, in transliterations to indicate a glottal stop). In the latter case, it is also referred to as a letter apostrophe.

    Punctuation Apostrophe. U+2019 right single quotation mark is preferred where the character is to represent a punctuation mark, as for contractions: “We’ve been here before.” In this latter case, U+2019 is also referred to as a punctuation apostrophe.

    An implementation cannot assume that users’ text always adheres to the distinction between these characters. The text may come from different sources, including mapping from other character sets that do not make this distinction between the letter apostrophe and the punctuation apostrophe/right single quotation mark. In that case, all of them will generally be represented by U+2019.

    The semantics of U+2019 are therefore context dependent. For example, if surrounded by letters or digits on both sides, it behaves as an in-text punctuation character and does not separate words or lines.

    So, according to its standard, the appropriate Unicode character to use for glottal stops in the Navajo language is MODIFIER LETTER APOSTROPHE (U+02BC) (ʼ).

    Before 2023-02-02, I had recommended use of RIGHT SINGLE QUOTATION MARK (U+2019) (’) primarily as a means to get away from using the "overloaded" character APOSTROPHE (U+0027) where reasonable. However, going forward, I'm recommending U+02BC instead.

    Input methods designed for the Navajo language should dedicate an entire key to MODIFIER LETTER APOSTROPHE (U+02BC) (ʼ) as it would for the ASCII letter LATIN SMALL LETTER K (U+006B) (k).

    In summary, ʼ is the glottal stop consonant, not '.
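
    For text that was already typed with the other two characters, the substitution can be sketched in Bash. This is only an illustration under a strong assumption: every U+0027 and U+2019 in the input stands for a glottal stop, which is not true of mixed Navajo and English text, where genuine punctuation apostrophes would be converted too. The function name normalize_glottal_stop is hypothetical.

```shell
#!/bin/bash
# Replace APOSTROPHE (U+0027) and RIGHT SINGLE QUOTATION MARK (U+2019)
# with MODIFIER LETTER APOSTROPHE (U+02BC) in UTF-8 text read from stdin.
# Assumption: every such character in the input is a glottal stop.
normalize_glottal_stop() {
  local letter=$'\xca\xbc'       # UTF-8 bytes of U+02BC
  local quote=$'\xe2\x80\x99'    # UTF-8 bytes of U+2019
  sed -e "s/'/${letter}/g" -e "s/${quote}/${letter}/g"
}
```

    The byte sequences are written out explicitly so the result does not depend on how the script file itself is encoded.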

See Also

Wikipedia articles exist for the authors:

Posted 2023-02-01T10:04:37+0000

Libro.fm Recommendation

Created by Steven Baltakatei Sandoval on 2023-01-04T15:26+00 under a CC BY-SA 4.0 license and last updated on 2023-05-31T19:09+00.

Edit(2023-05-31):Add reboil.com wiki links.

Background

I enjoy listening to audiobooks. I first began listening to them regularly in 2010 upon my return to Stanford University after serving a 2-year mission for the LDS Church in Panamá. The iPhone had come out while I had been out of the country (I still remember seeing my first iPhone at an electronics shop in David, Chiriquí; I was amazed by how responsive the operating system was to touch screen input when resizing photographs via the novel "pinch and zoom" mechanic). I didn't purchase an iPhone immediately (I think I used a flip phone until purchasing an iPhone from AT&T after I left college), but I did purchase, with my allowance from my father, an iPod Touch, which was basically an iPhone without a SIM card slot. I bring up the iPod Touch story because I believe its portability and Audible–iTunes integration allowed me to listen to audiobooks while away from my desktop computer. Audible was the first company I purchased audiobooks from; I would continue using it until 2022.

Before I left Audible

I listened to Audible's audiobooks for about 12 years (2010/2022); these were encrypted by Digital Rights Management (DRM) schemes that inhibited copying. I had not yet learned the importance of using Free/Libre Open Source Software (FLOSS) formats (I wouldn't stop regularly using Windows until I purchased my first dedicated Debian GNU/Linux workstation from Think Penguin in 2018). Therefore, I spent thousands of USD over time buying audiobooks. Audible's feature of allowing me to download and listen to audiobooks indefinitely (from their servers and using only their closed-source apps) kept me satisfied. Even today, in 2023, I'm fairly certain I could install the Audible app from the Google Play store and download every audiobook I have "purchased" from them.

I believe my first misgivings about using Audible were when I realized in transitioning to using FLOSS that I couldn't listen to my audiobooks. In 2018 I would have had to use my Android smartphone or my Windows machine since Audible published their software for use on those platforms. There is no official Audible player in the Debian repository. I can't open an encrypted Audible file in ffmpeg on my Debian machine to compress it; I'd have to use a janky daisy chain of audio inputs and output devices to even be able to do automatic speech transcription in case I wanted to search what I was listening to at a later time. Still, that wasn't enough activation energy to get me to leave Audible until 2022.

The thing that triggered my departure was something mundane: a billing misunderstanding. At some point I had failed to realize that my Audible credits did not roll over from year-to-year. In the beginning, I didn't realize that such a policy existed since I generally used up my platinum subscription credits immediately, especially when I drove a long commute during 2011/2018 in the mostly featureless landscape of southern Utah. After I resigned from my commuting job and I wasn't forcing myself to drive two hours a day (nearly 10% of my life) anymore, I found myself listening to Audible audiobooks less. By the time 2022 rolled around, I hadn't checked my Audible app in months. It was in late 2022-03 that I realized that my credits regularly had been expiring instead of accumulating. A background daydream that I would one day buy a long audiobook series on Audible all at once was dispelled. I decided to leave.

The interim

I decided that if I were to return, it would be if I could guarantee my audiobooks were DRM-free. For some months in 2022 I subsisted on podcasts such as Opening Arguments (for law explanation by a lawyer), Citation Needed (for comedic takes on various Wikipedia articles) and Security Now (to be aware of IT-specific news). In the past I knew that it was possible to download audiobooks directly from authors if authors took the effort to do so; for example, in late 2020, I purchased DRM-free copies of Cory Doctorow's books Radicalized (2019; WorldCat) and Attack Surface (2020; WorldCat), paying him via PayPal and receiving a download link to DRM-free zip files containing unencrypted audio files. A friend recommended I use AudioAnchor, an F-Droid app designed to facilitate audiobook listening on an Android phone; it worked great. However, Cory Doctorow is only a single author; I wanted a DRM-free audiobook vendor.

Libro.fm: My new audiobook source

In late 2022 I discovered Libro.fm via a blog post by Cory Doctorow on boingboing.net talking about how Google launched a DRM-free audiobook store. In background that he provided, I latched onto some DRM-free audiobook store recommendations that he made, including Downpour and Libro.fm. I poked around both Downpour and Libro.fm and found that I liked Libro.fm best. I bought How To, by Randall Munroe, and Klara and the Sun, by Kazuo Ishiguro.

Since then, I've purchased various titles including:

  • What We Owe the Future by William MacAskill
  • The Silver Ships series (minus the first book since that's an Audible exclusive, but it's pulp sci-fi so, no book is really that critical to the entertainment)
  • American Crusade by Andrew L. Seidel
  • What If? 2 by Randall Munroe
  • Educated by Tara Westover (from Obama's 2019 summer reading list)
  • Seveneves (a book I already purchased on Audible back in 2015 but I really wanted a copy I could preserve)
  • Artemisa por Andy Weir (Spanish version of Artemis)
  • El marciano por Andy Weir (Spanish version of The Martian; Andy Weir's works in English seem to be Audible exclusives, so those two years walking around Panama didn't go completely to waste =P)
  • Proyecto Hail Mary (Spanish version of Project Hail Mary)
  • NPCs by Drew Hayes (some Dungeons and Dragons-themed comedy)

I noticed that Libro.fm lacks the selection of Audible. For example, it doesn't carry my favorite Terry Pratchett novel Small Gods (1992) but it does carry recent titles of his such as Snuff (2011) and The Shepherd's Crown (2015).

Aside: DRM piracy

I imagine the main reason Audible restricts access to its audiobooks via DRM is piracy. Some people, when they get their hands on an unencrypted digital file, share it with others. Digital copies can be manufactured at basically zero cost, but commercial publishers like Audible grew rich on margins over production and distribution costs: books had mass, which incurred costs to which a percentage fee could be applied at the final sale. When the distribution cost fell to zero, instead of following Apple and the music industry in 2006 and simply selling songs at 0.99 USD each, they chose to require customers to run secret software that decrypts books at the point of consumption. That isn't to say that none of the music Apple sold was locked by DRM; many songs were. But the point of retelling this history is that DRM is not required to make money.

Services such as Libro.fm sell audiobooks without DRM. No special software is required to play the audio. It's true that I could upload these files to some server and share them with my friends. However, I think what keeps most people from doing so are issues of trust and effort. Downloading files from the internet and double-clicking on them is a fast way for the average user to infect their computer with malware. A sort of natural selection of behaviors is at work. Behaviors that result in broken computers, such as downloading and running files from unknown sources, are seen as destructive, and the sites involved are avoided. Behaviors that result in non-broken computers and a simple, high-quality experience are seen as good. Some people dedicate time to mastering the esoteric computer science techniques of verifying cryptographic digests, preserving their anonymity via onion routing, maintaining a firewall around their home networks, and regularly updating their software with the latest security updates; these people can be effective pirates. However, with all those skills they can also become effective software developers and make money that they can spend at places like Libro.fm or Downpour.com to save themselves the trouble of having to bypass DRM restrictions in the first place. The real value that DRM-free audiobook vendors can provide has two parts:

  • Files are guaranteed to be available for fast download.
  • Files are guaranteed not to be malicious.

When pirating to avoid DRM-locked media, a user might expect to spend anywhere from an hour to weeks identifying and downloading media that might be a trojan horse. With DRM-free vendors, a user can expect to spend a few minutes and receive a commercial guarantee of the product's authenticity. When you use Audible, you form an ongoing contract that Audible can end at any time, resulting in your "purchases" becoming unusable noise. When you use Libro.fm, Libro.fm can't retroactively make the files they sold you unusable; without DRM, there is no mechanism for controlling user behavior. A principle of Free/Libre Open Source Software is the avoidance of such methods of control in order to grant the user freedom.

Conclusion

Although lacking in selection, Libro.fm surpasses Audible in that the money I spend with them results in audiobooks that I can preserve forever without worrying about finding an app to verify I have a license to download some decryption key. This is why I'm redirecting my cash flow towards DRM-free vendors.

Copyright

"A tower of used books - 8443" by Jorge Royan is licensed under CC BY-SA 3.0.

Posted 2023-01-04T20:53:32+0000

Inactive on Twitter

Created by Steven Baltakatei Sandoval on 2022-11-10T20:14+00 under a CC BY-SA 4.0 (🅭🅯🄎4.0) license and last updated on 2023-06-03T12:23+00.

UPDATE (2023-02-01): I think I finally managed to delete all my tweets and likes from Twitter via TweetDelete. Some previous attempts didn't quite clear everything from 2017 and earlier. I've been enjoying using my twit.social/@baltakatei account with the Tusky app via F-Droid.

UPDATE (2022-11-23): My new microblogging feed is at twit.social/@baltakatei, one of many Mastodon servers. My last Twitter post is an announcement of this migration. I chose twit.social since it is operated by Leo Laporte, the host of several podcasts and television shows I have listened to in the past and found trustworthy as far as communicating technology news. I still listen regularly to his and Steve Gibson's Security Now podcast.

Edit (2023-05-31/2023-06-03): Add reboil.com wiki links.

I decided not to be active on the microblogging site Twitter after Elon Musk completed his purchase of the publicly traded social media company and promptly fired the CEO and dissolved the board of directors, making himself the only director. I had developed some trust of its original CEO, Jack Dorsey, back when Twitter had been the subject of discussion on Leo Laporte's This Week in Tech podcast in the late 00s. In the 2010s I decided that I would be okay publishing text on Twitter because from the get-go the site explained that what was submitted would be public. In contrast, Facebook (which I deactivated back in the early 2010s, long before Zuckerberg renamed it "Meta") advertised privacy settings that would allow posts to be shared only with a limited number of contacts (and with Facebook employees); however, the privacy settings were complex and there didn't seem to be a default setting that would stick over time. So, Twitter's transparently public nature seemed more honest. My posts would be available and there was no sign that the administrators of the site favored any particular political party; the most common reason I saw for tweets being removed was threats of violence or harassment. Prior to 2022, posts to Twitter could be relied upon to remain unfiltered, provided you weren't threatening violence or spreading misinformation.

That changed in 2022 when I saw Elon Musk purchase the company, making the service his own privately owned property. Now, were I to continue to post to Twitter, I would be making a public donation to Musk that he could choose to throw away like he did the company leaders that he fired. That in itself may not have been a dealbreaker for me, but he also proceeded to endorse the Republican Party, which continues to rely upon the criminal President who organized the attempted coup of the United States of 2021-01-06. His tweet removed any doubt that he would turn Twitter into a tool to promote the Republican Party. His exclusive ownership of Twitter gives him privileged mechanisms to promote his own political opinions while silencing others, including:

  • Removing user-submitted content that criticizes him (as he has banned users for adopting his name and image in protest).
  • Removing features from his critics (as Congresswoman Alexandria Ocasio-Cortez reported).

I admit that many people are firmly in the habit of using Twitter as their default social media space to remain connected to each other. Choosing to leave Twitter for another space risks losing contact with people who have not yet left. Habitual use of Twitter is like a gravity well that requires a significant activation energy of its inhabitants to escape. However, I stand by my decision for reasons similar to those that compelled me to leave Facebook: I can no longer assume what I post will be secure from censorship.

So, what is my social media space? Without Twitter, Reddit is my default. I'd like to make use of this blog more often, although I will need to figure out a more convenient way to post content. Currently, my process is:

  • Author posts in Emacs Org mode.
  • Export posts into Markdown text.
  • Commit the Markdown text to a git repo.
  • Push the commit to my reboil.com server.
  • Wait for an update script to run or log into the server to run it manually.

I could probably automate all that into a single Emacs function or bash script, given enough time, in order to mimic the simplicity of microblogging. However, for now, these longer form posts satisfy me.
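As a rough illustration of the posting steps listed above, here is a minimal shell sketch of what such a script might look like. It is not the script used for this blog. The remote name `origin`, the branch `master`, and the `PUBLISH_HOST` / `PUBLISH_HOOK` variables are all hypothetical placeholders, and the Org export assumes Emacs ships with the `ox-md` exporter. Setting `DRY_RUN=1` only prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: publish one Org mode post in a single step.
# All remote names, branch names, and hook paths below are assumptions.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

publish_post() {
    org_file="$1"
    md_file="${org_file%.org}.md"   # export target written next to the .org file

    # 1. Export the Org file to Markdown using Emacs in batch mode.
    run emacs --batch "$org_file" --eval "(require 'ox-md)" \
        -f org-md-export-to-markdown || return 1

    # 2. Commit the exported Markdown to the git repo.
    run git add "$md_file" || return 1
    run git commit -m "Add post: $(basename "$md_file")" || return 1

    # 3. Push the commit to the server (hypothetical remote and branch).
    run git push "${PUBLISH_REMOTE:-origin}" "${PUBLISH_BRANCH:-master}" || return 1

    # 4. Optionally trigger the server-side update script over ssh.
    #    PUBLISH_HOOK is a hypothetical path; leave it unset to wait for
    #    the scheduled update instead.
    if [ -n "${PUBLISH_HOOK:-}" ]; then
        run ssh "${PUBLISH_HOST:-reboil.com}" "$PUBLISH_HOOK" || return 1
    fi
}
```

After sourcing the file, `DRY_RUN=1 publish_post posts/example.org` previews the commands that would run; dropping `DRY_RUN=1` executes them. Keeping each step behind the `run` helper makes it easy to check the sequence before letting it touch the repository.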

Posted 2023-01-04T15:24:51+0000

This blog is powered by ikiwiki.

Copyright © 2020-2023 Steven Baltakatei Sandoval (PGP: 0xA0A295ABDC3469C9). Text is available under the Creative Commons Attribution-ShareAlike license (🅭🅯🄎4.0); additional terms may apply.