Opening Signal Android Backups with signal-backup-decode
Created by Steven Baltakatei Sandoval on 2021-01-02T04:06Z under a CC BY-SA 4.0 license and last updated on 2021-01-03T02:04Z.
Summary
Today I want to document how I extracted messages and attachments from a Signal backup from my Android phone using my Debian machine. The end result is a directory containing:
- attachments (images, other files attached to Signal messages)
- avatars (profile images of contacts)
- stickers
- an SQLite 3.x database containing timestamped messages and contact information
A Hail Mary script summarizing all the commands is:
#!/bin/bash
mkdir /tmp/tempWorkDir && pushd /tmp/tempWorkDir
sudo apt install -y git rustc libsqlite3-dev libssl-dev pkg-config protobuf-compiler sqlitebrowser
git clone https://github.com/pajowu/signal-backup-decode ./localRepo
pushd ./localRepo
cargo install --features "rebuild-protobuf" --locked --path .
popd
~/.cargo/bin/signal-backup-decode signal-2021-01-01-23-59-59.backup --output-path ./output/ --password-file /tmp/sigkey.txt --output-type RAW --force
sqlitebrowser ./output/signal_backup.db &
where /tmp/sigkey.txt contains the 30-digit backup password (a.k.a. "passphrase") with no spaces or newlines.
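The password file can be created without a stray trailing newline by using printf instead of echo; the 30-digit value below is a placeholder, not a real passphrase:

```shell
# Write the 30-digit backup passphrase (placeholder digits shown) to
# /tmp/sigkey.txt with no spaces and no trailing newline.
printf '%s' '123456789012345678901234567890' > /tmp/sigkey.txt

# Sanity check: the file should be exactly 30 bytes.
wc -c < /tmp/sigkey.txt
```

If wc reports 31 bytes, a newline probably snuck in (e.g. from echo) and decryption will fail.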
Table of Contents

- Opening Signal Android Backups with signal-backup-decode
Background
I use the Signal messaging app for Android for end-to-end encrypted communications. But sometimes I want to be able to archive certain sets of these communications into my own storage system. The Signal app itself doesn't provide such a method. However, since 2018 Signal doesn't seem to have gone out of its way to make its backup file format a moving target. Therefore, some open source software projects have been available for years that are capable of converting these backup files into useful decrypted forms without many updates. A package I used in 2018 was Pajowu's signal-backup-decode program. Using some notes I took years ago I have updated my personal procedure for decrypting Signal Android backup files; I decided to share these notes since I haven't seen much elsewhere describing how to compile or use this software.
I have tested these instructions on a Debian 10 machine (specifically PureOS) and a Debian Testing (a 10/11 hybrid; a.k.a. "bullseye/sid") machine. Although not tested, I imagine the git repository should be compilable on macOS using a different package manager (e.g. Homebrew) to install the dependencies.
These instructions aren't necessary for the desktop versions of Signal since the attachments are not encrypted there; messages, however, must be extracted using a different procedure. As stated in the Signal subreddit, I don't think there is a simple method to extract messages from the iOS version of Signal.
Procedure
Review Installation Instructions
I used the following sources for instructions on how to form the commands I need:
- Pajowu's signal-backup-decode github repository. The README.md and Issue pages for this repo are the most useful.
- A page describing cargo's --locked compile option.
- A Reddit FAQ that mentions signal-backup-decode among other software.
Install Dependencies
On my Debian Testing machine (approximately Debian 11), I installed some dependencies using apt. I selected these according to this Issue page on the git repo.
$ sudo apt install git rustc libsqlite3-dev libssl-dev pkg-config protobuf-compiler
Clone the git repository
The following command can be used to clone the git repository to a working directory where the cargo install command can later be run to start the compiling process.
$ git clone https://github.com/pajowu/signal-backup-decode ./localRepo
Compile signal-backup-decode
The git repository contains source code written in the Rust language that can be compiled by running the cargo install command upon the repository's working directory ./localRepo. cargo is installed automatically as a dependency of rustc. The commands to move into the repository's working directory and execute cargo install upon it are as follows:
$ pushd ./localRepo
$ cargo install --features "rebuild-protobuf" --locked --path .
The --features and --locked options require some more explanation.

The git repository contains a Cargo.lock file which may be used in tandem with the --locked flag in order to force cargo to compile signal-backup-decode using specific versions of dependencies (called "crates") which cargo automatically downloads (explanation here).

The git repository README.md also indicates, according to this issue, that it may be necessary to rebuild a protobuf feature that should already be included in the repository; I kept the --features "rebuild-protobuf" option that requests the rebuild; it might not be necessary.
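cargo install places the compiled binary in ~/.cargo/bin by default; before moving on, it may help to confirm the build actually produced it (a small sketch, assuming the default cargo home directory):

```shell
# Confirm the compiled binary landed in the default cargo bin directory.
BIN="$HOME/.cargo/bin/signal-backup-decode"
if [ -x "$BIN" ]; then
    echo "signal-backup-decode build OK"
else
    echo "binary missing; check the cargo install output" 1>&2
fi
```

If the check fails, rerunning cargo install from the repository directory should reprint the underlying compiler error.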
Decrypt the Signal backup
The actual backup can be decrypted and extracted to an automatically-generated directory at ./output/ with the following command:
~/.cargo/bin/signal-backup-decode signal-2021-01-01-23-59-59.backup --output-path ./output/ --password-file /tmp/sigkey.txt --output-type RAW --force
04:35:40 [INFO] Output path: ./output-force/
04:35:40 [INFO] Input file: signal-2021-01-01-23-59-59.backup
04:35:40 [INFO] Database Version: 85
Bytes read: [00:00:25] [##################################################] 3.94GB/3.94GB
Read vs. written frames: [00:00:25] [##################################################] 31144/31144
The end result is the creation of the ./output directory filled with the following items:
attachment
avatar
preference
signal_backup.db
sticker
Note, the --force option is required since there seems to be a file overwrite problem (see this issue) with a file in preference/. Unless --force is used, the signal_backup.db SQLite 3.x database file is never written.
Inspect the Signal backup
The signal_backup.db file can be opened using sqlitebrowser:
$ sudo apt install sqlitebrowser
$ sqlitebrowser ./output/signal_backup.db
Under the "Browse Data" tab, messages are stored within the sms table.
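If a GUI is unavailable, the sqlite3 command-line shell (Debian package sqlite3) can perform the same inspection; this sketch assumes only the sms table observed above:

```shell
# List all tables in the decrypted database, then count rows in the
# sms table (the table shown under sqlitebrowser's "Browse Data" tab).
sqlite3 ./output/signal_backup.db '.tables'
sqlite3 ./output/signal_backup.db 'SELECT count(*) FROM sms;'
```

The .tables dot-command is useful for discovering what else the backup contains before writing any queries against it.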
Conclusion
The process for decrypting backups produced by the Android version of the Signal app hasn't changed much since 2018. The process is complex since it involves compiling an executable binary from Rust source code as well as some familiarity with the git version control system. The commands described in this document may need to be altered over time if Signal alters their backup format and the maintainer of signal-backup-decode updates their program's options in response.
Software Versions
Versions of software mentioned in this document are as follows:
cargo
$ apt list --installed | grep cargo
cargo/testing,now 0.47.0-3+b1 amd64 [installed,automatic]
$ cargo --version
cargo 1.46.0
git
$ apt list --installed | grep git
git/testing,now 1:2.29.2-1 amd64 [installed]
$ git --version
git version 2.29.2
libsqlite3-dev
$ apt list --installed | grep libsqlite3-dev
libsqlite3-dev/testing,now 3.34.0-1 amd64 [installed]
libssl-dev
$ apt list --installed | grep libssl-dev
libssl-dev/testing,now 1.1.1i-1 amd64 [installed]
pkg-config
$ apt list --installed | grep pkg-config
pkg-config/testing,now 0.29.2-1 amd64 [installed]
protobuf-compiler
$ apt list --installed | grep protobuf-compiler
protobuf-compiler/testing,now 3.12.3-2+b2 amd64 [installed]
rustc
$ apt list --installed | grep rustc
rustc/testing,now 1.48.0+dfsg1-2 amd64 [installed]
$ rustc --version
rustc 1.48.0
sqlitebrowser
$ apt list --installed | grep sqlitebrowser
sqlitebrowser/testing,now 3.12.1-2 amd64 [installed]
$ sqlitebrowser --version
DB Browser for SQLite Version 3.12.1
Built for x86_64-little_endian-lp64, running on x86_64
Qt Version 5.15.1
SQLite Version 3.33.0.
Chemical Engineering applications using the Bitcoin blockchain
Created by Steven Baltakatei Sandoval on 2020-12-30T23:29Z under a CC BY-SA 4.0 license and last updated on 2020-12-31T01:15Z.
Summary
I composed the following post in response to a discussion prompt in the AIChE Engage forum.
Prompt
To: Discussion Central
Subject: Blockchain technology
Blockchain technology is an upcoming trend when it comes to immutable hack-proof digital technologies. Bitcoin is a classic example. I have a strong feeling that blockchain could be the next big thing since ARPANET become our INTERNET.
Anyone aware of the potential of this technology being implemented in chemical process industries (CPI)…
Ideas are welcome..
------------------------------
Rohit Korde MSc,CEng
Program Manager
Worley
Thane (W)
------------------------------
Reply
To: Discussion Central
Subject: RE: Blockchain technology
I run my own Bitcoin node and have maintained a Lightning channel in the past. I'm not as familiar with Ethereum or other proof-of-stake blockchains beyond knowing that they gain scalability and flexibility over Bitcoin (which uses proof-of-work) at the cost of decentralization. In contrast to its token price volatility, the dominant (in terms of market capitalization) Bitcoin software changes less frequently over time. Therefore, some ideas I have regarding applications may prove significant in the chemical process industries. Although I use Bitcoin as an example, none of these ideas requires involvement of any particular commercial entity or blockchain, so long as a permission-less blockchain is used (i.e. one based on proof-of-work). These ideas are influenced by my time working as a Plant Engineer / MOC Coordinator for a small independent oil and gas company's carbon dioxide reinjection plant in southern Utah.
Idea 1. Historian archive integrity
The idea is to prove the existence of process data using the Bitcoin blockchain. This may be accomplished by software that periodically (e.g. every 3 hours) constructs and submits a bitcoin transaction (e.g. via OP_RETURN) containing the merkle root of a hash tree. Each leaf of the tree is the cryptographic digest (i.e. "hash", e.g. SHA-256) of a small file.
Take, for example, a situation in which each leaf is a file containing the time, equipment tag, and instrument reading of a gas composition transmitter. Already there exist public APIs known as calendar servers which accept files submitted through a website or Python script. In theory a process plant's historian could extend a calendar server's merkle tree by creating a local hash tree every 3 hours from many instrumentation readings across the entire facility and then submitting as a file this tree's merkle root. The plant's historian software could then maintain a database containing these trees. Linking individual instrument readings in this way is useful because it would then be possible to prove with a hard cryptographic guarantee that particular sets of historian data existed at (or after) a certain time. Additionally, the prover (e.g. a process engineer with access to the merkle tree database) need not provide entire trees to a verifier (e.g. an emissions compliance consultant); only a "merkle branch" need be sent for each individual leaf of the tree, reducing data bandwidth and storage requirements. Each merkle branch consists of a set of cryptographic hashes of cryptographic hashes pointing from the embedded bitcoin transaction (immutable) to a gas composition reading. The novelty of this arrangement derives from two facts:
- The Bitcoin blockchain did not exist until 2009-01-03T18:15:05+00:00.
- The Bitcoin blockchain is timestamped and immutable.
The bottom line is that the Bitcoin blockchain permits creation of proofs that specific data existed in the past. This may be useful for investigating/preventing rewriting of history that two parties may disagree about (e.g. process data of a costly upset, fraudulent mill test reports).
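To make the hash-tree arrangement concrete, here is a minimal sketch (assuming GNU coreutils sha256sum; the file names, equipment tags, and readings are hypothetical) of reducing two leaf files of instrument data to a merkle root:

```shell
# Two hypothetical leaf files: time, equipment tag, instrument reading.
printf '%s' '2021-01-01T00:00Z AT-101 0.950' > /tmp/leaf1.txt
printf '%s' '2021-01-01T00:00Z AT-102 0.021' > /tmp/leaf2.txt

# Hash each leaf with SHA-256, then hash the concatenated digests to
# form the parent node; for a two-leaf tree that parent IS the merkle
# root, which is what would be embedded in the bitcoin transaction.
h1=$(sha256sum /tmp/leaf1.txt | cut -d' ' -f1)
h2=$(sha256sum /tmp/leaf2.txt | cut -d' ' -f1)
root=$(printf '%s%s' "$h1" "$h2" | sha256sum | cut -d' ' -f1)
echo "merkle root: $root"
```

A verifier holding only leaf1.txt and the sibling digest h2 can recompute the same root, which is the "merkle branch" property that keeps bandwidth and storage requirements small.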
Idea 2. Micropayment-facilitated process miniaturization
The idea is to use blockchain micropayments to increase the economic viability of expensive process intensification that reduces the size of a process plant. Cost savings come from reducing the time (and overhead costs) between an operations service company (e.g. equipment maintenance, feedstock suppliers) performing a service and receiving payment for that service. Additional revenue can be obtained by installing small plants in locations where a larger one is not cost effective. For example, crude oil pumped from a marginal well in a remote location today likely must be transported through a pipeline to a distant refinery where unit operations may be applied to produce petrol products useful to the human population near the wells. Due to distance, the pipeline may prove too uneconomic to construct or maintain. If a well owner installed a miniaturized refinery at each individual well in order to bypass the need of a pipeline, the mundane issue of the monthly billing cycle appears. A contract service company may judge a single small well-refinery operation as not large enough to justify allocating accountant time to process checks. In theory, existing financial institutions equipped with modern computers have the capability to facilitate micropayments, but most efforts have floundered.
In this example, the mini-refinery could directly pay services providing maintenance labor and expendables (ex: TEG, lubricant, refrigerant, etc.) through the Lightning Network (LN). For a summary of how LN works with Bitcoin, I refer you to the background section of this 2020 Scientific Reports paper. The bottom line is that the mini-refinery could purchase service technician time or sell dispensed product with transactions carrying fees as low as 0.00000001 BTC (0.00029 USD) per transaction.
With such low transaction fees, sale of a product stream could be performed nearly continuously, with the refinery's LN channel's bitcoin balance state updated on a per-minute basis from flowmeter readings. A purchase order for technician time could be issued and approved automatically for every minute the technician is on site. Only when a refinery's LN channel balance becomes lopsided (e.g. with its own earnings) will the refinery owner have to submit anything to the blockchain. The channel can be closed at any time by the owner or their LN channel partner since every LN transaction is also a valid Bitcoin transaction. A customer receiving product (e.g. gasoline) need not have a channel with the well owner in order to purchase product, so long as a path of channels exists in the Lightning Network connecting the customer to the owner.
The well-refinery in this example could be replaced with other processes described in this session of the 2019 AIChE Annual Meeting such as electrical power generation and biomass conversion to fuels.
Blockchain inefficiency
I would warn anyone reviewing the applicability of "blockchain" technology to beware of the fact that a blockchain is an extremely inefficient method of accounting compared to a centralized trusted database. I would argue that utilizing a blockchain for accounting is only justified in applications where parties cannot be trusted not to attempt to steal funds via chargebacks (e.g. a customer filling their automobile's gasoline tank but later claiming the credit card used to pay for the gasoline was stolen). Examples of blockchain inefficiency include:
- Data storage requirements. Blockchains maintain integrity of transaction history by recording every transaction ever performed. Due to this limitation, Bitcoin can only support approximately 3 to 7 non-LN transactions per second; up to approximately 1 megabyte of new data must be permanently stored every 10 minutes. As of the latest block (number 663,744), 362,023,310,633 bytes are required to store the Bitcoin blockchain.
- Proof-of-work energy requirements. Permission-less blockchains (e.g. Bitcoin) achieve decentralized consensus by expending electrical energy to perform mathematical calculations. See this 3Blue1Brown video for an explanation. This article estimates Bitcoin power consumption to be on the order of 4 gigawatts in January 2020.
- Time delay. In contrast to a centralized trusted database that can add a new record nearly instantaneously, Bitcoin transactions should not be considered finalized until at least an hour has passed; that is, until enough "confirmations" have occurred. This limit varies according to the rules of each blockchain type but is a function of the protocol's target time delay between each block of transactions (e.g. 10 minutes for Bitcoin, 2.5 minutes for Litecoin, image).
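The storage claim above can be sanity-checked with shell arithmetic: at the roughly 1-megabyte-per-10-minute cap, worst-case growth works out to:

```shell
# 6 blocks per hour * 24 hours * 365 days, at 1 MB per block.
blocks_per_year=$((6 * 24 * 365))
echo "worst-case growth: ${blocks_per_year} MB/year (about 53 GB/year)"
```

which is consistent with the roughly 362 GB the chain had accumulated by block 663,744 after a decade of operation (early blocks were far below the cap).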
End.
------------------------------
Steven Baltakatei Sandoval
reboil.com
Vancouver WA
------------------------------
Ikiwiki Setup (part 2)
Created by Steven Baltakatei Sandoval on 2020-12-23T02:35Z under a CC BY-SA 4.0 license and last updated on 2020-12-23T15:15Z.
Update
I figured out how to migrate the git repository containing my legacy reboil.com website (an Amazon Web Services S3 bucket using Route 53 DNS routing (see this reference)) to an ikiwiki blog (this blog post you should be reading as of 2020-12-23). Using this handy procedure, I found it to be possible to maintain a line of continuity in git commit and log data in the git repository that the default ikiwiki configuration on my Freedombox uses.
I spent the past few hours figuring out how to get ikiwiki, git, and cron to play nicely together so I can push blog posts like this using a git push operation from my personal laptop without having to login as root to my Freedombox via ssh. I think I have something working. Even if my configuration changes in the near future, the ability to version-control my blog posts gives me peace of mind that I can preserve and roll back my blog's content.
Here are steps that I used to set this up.
Setup Procedure
The setup basically consists of the following steps:

1. Create an ikiwiki blog.
2. Create a gitweb bare repository to initially mirror the ikiwiki repository.
3. Push the initial ikiwiki repository master branch to the gitweb repo.
4. Add the gitweb repository as a remote in the ikiwiki repository.
5. Create a root user cron job on the Freedombox to periodically perform the required pushes, pulls, and ikiwiki rebuilds.
With all these preparations in place, it will then be possible to create a new blog post via two methods:
1. Creation, committing, and signing of a new blog post markdown file in a local clone of the gitweb repository on a personal computer.
2. Creation or editing of a markdown file via ikiwiki's web CGI interface. (Note: this commit is not signed and may later be manually pulled from, merged, and pushed to the gitweb repository.)
The following sections will explain these steps in detail for a blog named blog-test. Once a working setup is confirmed functional, repeat the steps for a blog named blog.
Setup ikiwiki application

- Install ikiwiki via the Freedombox plinth menus.
- Create a blog named blog-test. This will cause the URL of the blog to be shorter (i.e. https://reboil.com/ikiwiki/blog-test).

Setup gitweb application

- Install gitweb via the Freedombox plinth menus.
- Create a git repository named ikiwiki-blog-test.

Setup ikiwiki repository

- Create a backup git bundle of the blog's initial state via:

$ sudo su -
# pushd /var/lib/ikiwiki/blog-test
# git bundle create ~/$(date --iso-8601=seconds)..blog-test.bundle --all
- Add the gitweb remote (a local directory) via:

# git remote add localgitweb /var/lib/git/ikiwiki-blog-test.git

- The ikiwiki repo remote list should read:

/var/lib/ikiwiki/blog-test# git remote -v
localgitweb /var/lib/git/ikiwiki-blog-test.git (fetch)
localgitweb /var/lib/git/ikiwiki-blog-test.git (push)
origin /var/lib/ikiwiki/blog-test.git (fetch)
origin /var/lib/ikiwiki/blog-test.git (push)

- The ikiwiki repo branch list should read:

/var/lib/ikiwiki/blog-test# git branch --all
* master
  remotes/localgitweb/master
  remotes/origin/master
Setup gitweb repository

- In the Freedombox Plinth webUI, under Apps, under Gitweb, create a repository named ikiwiki-blog-test.
- From the ikiwiki repo working directory, push the blog master branch to the gitweb repo:

# pushd /var/lib/ikiwiki/blog-test
/var/lib/ikiwiki/blog-test# git push localgitweb master
Enumerating objects: 12, done.
Counting objects: 100% (12/12), done.
Delta compression using up to 2 threads
Compressing objects: 100% (11/11), done.
Writing objects: 100% (12/12), 1.65 KiB | 141.00 KiB/s, done.
Total 12 (delta 0), reused 0 (delta 0)
To /var/lib/git/ikiwiki-blog-test.git
 * [new branch]      master -> master

- Create and push the ikiwiki repo's master branch to the master branch on the gitweb repo:

# git push localgitweb master:master

- The ikiwiki repo branches should look something like this:

/var/lib/ikiwiki/blog-test# git branch --all
  ikiwiki_revert_337dd2a3446701388bafc9616bccfaac659ca44a
* master
  remotes/localgitweb/master
  remotes/origin/master
Setup root user cron job

- Create a bash script file at /root/ikiwiki_blog-test_update.sh. The script should perform the following pseudocode functions:

#!/bin/bash
# Fetch updates
echo "STATUS:Fetching updates." 1>&2
pushd /var/lib/ikiwiki/blog-test
git fetch --all
git checkout master

# Push ikiwiki repo 'master' to gitweb repo 'master'
# Ref/Attrib: https://stackoverflow.com/a/3206144
echo "STATUS:Pushing ikiwiki-master to gitweb-master." 1>&2
git push localgitweb master:master

# Pull gitweb repo 'master' to ikiwiki repo 'master'
# Ref/Attrib: https://stackoverflow.com/a/8888015
echo "STATUS:Pulling gitweb-master to ikiwiki-master." 1>&2
git pull localgitweb master:master

# Push changes
echo "STATUS:Push to ikiwiki." 1>&2
git push origin

echo "STATUS:Rendering ikiwiki." 1>&2
ikiwiki --setup /var/lib/ikiwiki/blog-test.setup --rebuild --verbose --gettime
popd

- Create a cron job:

# crontab -e

Have it contain the following line:

0 * * * * /bin/bash /root/ikiwiki_blog-test_update.sh 1>/dev/null 2>&1
You may have to manually perform an initial merge, but the /var/lib/ikiwiki/blog-test ikiwiki git repository should now automatically synchronize with the /var/lib/git/ikiwiki-blog-test.git gitweb repository at the top of every hour.
Summary
Using a Freedombox, ikiwiki, git, gitweb, and cron, it is possible to set up a blog that can be updated remotely via git push.
This work by Steven Baltakatei Sandoval is licensed under CC BY-SA 4.0
Citation Needed Hunt
Created by Steven Baltakatei Sandoval on 2019-07-16T14:23Z under a CC BY-SA 4.0 license and last updated on 2020-12-22T19:10Z.
Citation Hunt
A tool for looking up random sentences with {{citation needed}} tags in Wikipedia is Citation Hunt.
This can be used to help find a random place to start improving Wikipedia.
It is not an efficient method for improving Wikipedia (given how easy it is for an editor to add a {{citation needed}} tag compared to how difficult it is to understand and locate an appropriate source). However, I think it is more useful than clicking Wikipedia's "Random article" link since it can help focus your mind on a single sentence claim; when I open a random Wikipedia article and see dozens of reference tags and paragraphs of text, the phrase "Where do I even start?" comes to mind.
References
- 1. "Citation Hunt". Website:Wikimedia Toolforge. Date Accessed: 2020-12-22. Archive link. Archive date: 2020-12-21.
This work by Steven Baltakatei Sandoval is licensed under CC BY-SA 4.0
Russian Roulette
Created by Steven Baltakatei Sandoval on 2019-07-18T05:04Z under a CC BY-SA 4.0 license and last updated on 2020-12-23T00:52Z.
Background
I took it upon myself to review a {{citation needed}} tag on the Russian roulette page on Wikipedia.
I found a reference that cited the Oxford English Dictionary which itself cited a 1937-01-30 issue of Collier's, a magazine containing short stories. The issue contained a short story named "Russian Roulette" by a person named Georges Surdez. I found a source for the document here and here.
It's interesting to me that the Oxford English Dictionary cites a document that is rather obscure. It makes me wonder what a library filled with every source that the Oxford English Dictionary cites would look like. It seems like an ambitious project that would be necessary to preserve the English language's history in a technically satisfying manner. Something to think about.
Wikipedia edit
The Wikipedia article containing the updated information as of 2019-07-16T22:54:07 is here:
I had removed usage of "Russian Poker" from a description of a 2019-01 incident in which a police officer shot another police officer in what the New York Times describes as "Russian Roulette" but which no source (which I could find) reporting on the incident described as "Russian Poker". I think using that particular phrase to describe an incident that no source describes as such would be creating information out of nothing ("original research"). In this case, the information created is the strengthening of the link between the phrase "Russian Poker" and the concept of pulling the trigger on a possibly-loaded firearm while aimed at another person. I said as much in my descriptions of the edits.
I confirmed that the Collier's quote is partially referenced in a printed copy of the OED2 (page 295) in my local library. The relevant sections are:
> `REVOLUTION` *sb*. `I I`; **Russian roulette**, an act of
bravado in which a person loads (usu.) one
chamber of a revolver, spins the cylinder, holds
the barrel to his head, and pulls the trigger; also
*fig*.;
> Revolution had never taken place. **1937** `G. SURDEZ` in
*Collier's* 30 Jan. 16 ‘Did you ever hear of Russian roulette?’
…With the Russian army in Rumania, around 1917,…some
officer would suddenly pull out his revolver,…remove a
cartridge from the cylinder, spin the cylinder, snap it back in
place, put it to his head and pull the trigger.
Citation Hunt
I had originally found this page to edit via a Citation Hunt webpage that looks up random {{citation needed}} tags in Wikipedia articles and presents them to the user for consideration. URL is here.
I'm also considering using markdown to format text but it hurts legibility if I'm using vanilla emacs. (edit (2020-12-22T19:22Z): I rewrote this article in markdown.)
This work by Steven Baltakatei Sandoval is licensed under CC BY-SA 4.0
Kyoani Arson Attack
Created by Steven Baltakatei Sandoval on 2019-07-18T23:21Z under a CC BY-SA 4.0 license and last updated on 2020-12-23T00:53Z.
Wikipedia article
I helped to proofread references on the wikipedia article for the Kyoto Animation arson attack.
33 dead. The attack occurred at KyoAni's Studio 1 facility where normally about 70 people work.
This work by Steven Baltakatei Sandoval is licensed under CC BY-SA 4.0
Markup formats
Written on 002019-07-28T19:48Z by baltakatei
Wikitext
Since I use emacs as my editor I thought I'd see if there was a set of emacs tools designed to facilitate editing wikitext (what you edit when you click the "edit source" tab at the top of any Wikipedia article).
One tool I wish that existed is the ability to automatically format default in-line references into a format more visible for human eyes. Here is an example from the wikitext for the Kyoto Animation arson attack article:
<ref>{{Cite web|url=https://www3.nhk.or.jp/nhkworld/en/news/20190727_02/|title=Nearly 6 million dollars donated after Kyoto blaze {{!}} NHK WORLD-JAPAN News|website=NHK WORLD|language=en|access-date=2019-07-27}}</ref>
I want a tool that can convert that text into this:
<ref>{{Cite web
|url = https://www3.nhk.or.jp/nhkworld/en/news/20190727_02/
|title = Nearly 6 million dollars donated after Kyoto blaze {{!}} NHK WORLD-JAPAN News
|website = NHK WORLD
|language = en
|access-date = 2019-07-27
}}</ref>
I have been reading up on how the Emacs Lisp language works so I can write my own custom function to perform this formatting automatically on a region of text I select in emacs (I prefer to edit article source in emacs since I really like the keyboard shortcuts). I'm currently learning the basics.
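Until such an elisp function exists, a crude approximation is possible in the shell (assuming GNU sed, which accepts \n in the replacement text); this naive version splits on every pipe character, so it would mangle any template parameter containing a literal pipe:

```shell
# Naive wikitext reformatter: put each |param=value on its own line.
printf '%s' '<ref>{{Cite web|url=https://example.com/|title=Example|access-date=2019-07-27}}</ref>' |
  sed 's/|/\n|/g'
```

A real implementation would need to track template nesting depth rather than splitting blindly, which is exactly the kind of state an elisp function could keep while walking a selected region.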
LaTeX
I also had some curiosity about possibly using emacs for composing documents using the LaTeX markup language. I imagine that would be useful for producing documents explaining mathematical concepts in general.
This blog post, dated 2010-05-13 and titled Emacs as the Ultimate LaTeX Editor, seems promising. It recommends the use of the AUCTeX package available in the Debian repository (wikipedia page). It had a followup post which explained how LaTeX equation previews could be seen within the emacs GUI editor.
Markdown
I also decided I'd try and write this page using Markdown and the M-x markdown-preview function (which converts the markdown markup into an HTML file which it has my browser open).
I figured out I can use the markdown package to convert markdown files into HTML via:
$ markdown mytestfile.md > mytestfile.html
It looks better than raw text files, in any case. Maybe one day I'll get fancy and use texinfo or something from which I can auto-generate a static HTML website. For now, though, I'll focus on getting stuff written.
:: :: :: ::
2019-08-09T18:14:36Z; baltakatei>
Four Freedoms and Three Purposes
Below are some notes regarding my thoughts on how to identify a purpose for one's actions after you have discovered that there is no built-in purpose engraved into the laws of physics. See existential nihilism.
Four Freedoms
Source: Free Software Free Society (PDF)
Description: The Four Freedoms answer the question: "What abilities must a software programmer have in order to have control over computer programs they create?"
0. The freedom to run the program as you wish, for any purpose.
1. The freedom to study how the program works, and change it so it does your computing as you wish. Access to the source code is a precondition for this.
2. The freedom to redistribute copies so you can help your neighbor.
3. The freedom to distribute copies of your modified versions to others. By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.
I believe these principles can be generalized to cover any apparatus constructed of matter (including the human body and augmentations to human capabilities). As of 2019, I am unaware of these principles being applied in any significant scale to machinery required to sustain the current human population on planet Earth. Mechanical fabrication prints, P&IDs, PFDs, and industry consensus standards for machines involved in water purification, food production, waste processing, and other technologies required to sustain the current earthbound human population are mostly "closed-source". I imagine the four freedoms are not being applied to improvements in such technologies due to the fact that currently new improvements occur frequently. These improvements are protected by copyright and patent laws designed to accelerate creation of such improvements by granting patent owners temporary government-enforced monopolies in the manufacture of machines utilizing such improvements.
However, if new industrial machinery improvements are developed and released under "copyleft" licenses (ex: "Creative Commons") then collaborative efforts may outcompete the temporary government-enforced monopoly model (patents). For comparison, I direct you to the emergence of "free/libre and open source software" (FLOSS, a.k.a. FOSS) development as the source of software used in most production environments (ex: GNU/Linux). This would mean that a person in a future where complex technology beyond the median human's understanding is required simply for basic human survival would at least have the option of "opting out" from having to agree to a license agreement in order to not die.

In other words, the Free Software Foundation lacks an industrial arm to ensure people can choose to rely on freedom-respecting industrial machinery they need to survive. I think such an arm needs to be created, the sooner the better.
In order to have control over one's existence, all tools one uses to survive must satisfy the four freedoms.
Four freedoms (generalized)
Redefine "program" to include "any machine" and "source code" to include "technical documentation and source code required to fabricate the machine".
Note: For generalization across all baryonic matter, include in definition of "machine": "atomically precise molecular assembler".
One problem I see in the four freedoms is that there is no explicit provision addressing use of software to destroy or inflict physical harm. I imagine this omission is an artifact of the fact that software requires hardware to interact with physical reality. Physical hardware capable of causing energy or matter to flow is needed to inflict physical harm. Most weapons work by dumping sufficient energy into a small enough volume of space (ex: bullets) or causing certain types of disruptive material to flow (ex: poison). A computer control program does not directly cause harm; the final control element does (a hammer strikes a firing pin; an actuator opens a valve on a poison canister; a metal switch completes an electrical circuit). This raises the question: "How does one apply the four freedoms to hardware that may be used to kill someone?". Traditional agriculture tools such as scythes, sickles, pitchforks, horses, shovels, and sledgehammers all can be used to create food or to kill and destroy.
The question becomes one of purpose and motive. In general, I understand the purpose of a nation-state government to be the holder of a monopoly on violence. Therefore, a nation-state should be interested in controlling the production and possession of weapons and tools which can be used as weapons. A nation-state with strong gun control actively profiles users of potentially lethal tools for past misbehavior and statistical likelihood of misuse. In other words, gun control laws restrict freedom globally in order to reduce the number of individuals who may possess lethal tools giving them power to cause physical harm at scale, since infliction of lethal physical harm itself deprives victims of all freedom.
Freedom is being able to make decisions that affect mainly you; power is being able to make decisions that affect others more than you. If we confuse power with freedom, we will fail to uphold real freedom. (Free Software, Free Society, v3, Ch. 46, Pg. 257)
The restriction of freedom over lethal tools imposed by a nation-state upon users may come in the form of punishment for possession in certain areas. It may also come in the form of complete prohibition of sale or possession of certain lethal tools.
However, nation-states come and go on the timescale of human generations. What principles should an existential nihilist follow when not even the laws of physics endorse any particular creed or moral code? If I want to help implement the four freedoms for hardware I should be prepared to explain the problem even to someone who doesn't necessarily share my value system.
This next section is going to be one huge tangent but it led to some interesting thoughts that I thought I'd make public.
Purpose for existence in a purposeless cosmos
A rational observer should conclude that there is no all-powerful God or intrinsic meaning to existence. Nation-states historically have been built around shared hallucinations of religion, humanism, or money. If negotiations must be made with a group of inscrutable aliens or foreigners, what purposes for existing drive their value system? What purposes may already be shared in common between your familiar group and the foreign group prior to first contact? If coexisting with such foreigners is unavoidable and they share/possess lethal technologies without regard to your local government regulations, how can such common purposes be used to reestablish trade restrictions or to reevaluate the efficacy of existing restrictions?
These are thoughts that come to mind when I try to answer the question "What universal purposes might we share with foreigners whom we have never met?". The simple answer of "loving and supporting your family" comes to mind from my time working to open conversations with strangers as a missionary for the Mormon church. However, I want to define such a phrase methodically. What value and problems come from prioritizing resource allocation towards blood family members who may act irrationally as opposed to allocation towards non-blood friends who do act rationally?
An analogy to solving reaction rate kinetics
One strategy I have found useful when tackling a difficult question is to produce a set of answers, each element of which addresses a specific aspect of the original question. For example, when answering the question "What governs the amount of heat in a plastic polymerization reactor?", I would be inclined to answer with a rate equation composed of multiple terms added together. Usually it would be of the form:
(rate in) - (rate out) + (rate of generation) = (rate of accumulation)
The rates might be energy flows and/or mass flows. There may be multiple equations, one for each type of material present within the reactor. Two types of material might go in and therefore have positive "rate in" terms in their equations. If it is a steady-state reactor, the "rate of accumulation" term should be set to zero. The "rate of generation" might be the generation of heat that must be dissipated by convection, diffusion, and radiation processes. Concentrations of material might be involved. Each rate may be a function of chemical concentrations of various combinations of elements according to a separate set of rate-governing equations. See https://che.engin.umich.edu/people/scott-fogler/ .
However, my point is that a seemingly unsolvable problem can be made usefully solvable by splitting the problem into simpler component processes.
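As a toy illustration (the function name and numbers here are my own, not from any textbook), the balance equation above can be expressed directly in code:

```python
def accumulation_rate(rate_in, rate_out, rate_generation):
    """General balance: (in) - (out) + (generation) = (accumulation)."""
    return rate_in - rate_out + rate_generation

# For a steady-state reactor the accumulation term is zero, so any
# generated heat must exactly offset the net outflow.  With made-up
# values: 5 units in, 7 units out, 2 units generated inside:
assert accumulation_rate(rate_in=5.0, rate_out=7.0, rate_generation=2.0) == 0.0
```

Each term would, in a real problem, itself be a function of concentrations and temperatures, but the decomposition into simpler terms is the point.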
Components of 42
With this in mind, I thought it might be useful to split up the question "What is a common purpose for existing that any sentient being might share with me?" (a stand-in for Douglas Adams' more ambitious and vague question "What is the answer to Life, the Universe, and Everything", to which the answer is famously "42") into a set of "purposes" which a sentient (read: thinking) being made of matter might set as its purpose for living. Obviously, the number of possible purposes and permutations of purposes makes the assembly of an exhaustive list impossible. But by aiming to reduce the number of purposes to the minimum required to generate all purposes, then perhaps a set of fundamental "meanings of existence" might be reached.
Each "purpose" that one could dedicate their life to is an action. "To make lots of money" might be one, but what is money for, and why not something like seashells or gold? "To support my family" is a common one I think many in family-oriented religions would agree is their purpose for existing, but is there a more general way of describing what a family is that could also describe other human relationship configurations like mentorship or slavery or even money? If I could define such a system then I would have a minimum moral code that I could reasonably expect another rational observer of reality to have adopted already. Like weathermen who never met one another yet are still able to share stories with useful information based on the fact that they have been observing the same givens, perhaps I can use this minimum set of purposes as a communication tool for my friends and strangers we meet in the future. Ideas such as the four freedoms described above require a definition for "free/libre" which I may be able to describe within this system.
The end result of these ruminations is a set of four actions which I've converted into pairs of words in the style of English legal doublets (ex: cease and desist, free and clear, null and void, terms and conditions, aid and abet). I am calling three of them "self-evident purposes" since they are living processes that should be based only upon what other sentient beings can be expected to share in common with us. A fourth action ("Destroy and Forget") is an implied aspect of the three self-evident purposes but isn't a purpose by itself. However, it is a self-evident action present in every other process which deserves identification. After that, I proceed to define actions that can be described as permutations of the self-evident purpose-actions but which I do not necessarily believe may be shared among even humans on earth as purposes, much less all sentient entities. Some of the derived actions may be life purposes.
Self-evident system of purpose
Self-evident purposes
Source: introspection
These actions are answers to the question: "What common purposes for existence are we likely to share with any stranger we encounter?"
The question, in other words: "What are members of the smallest set of actions that you will likely share with any alien/stranger?"
👁 Observe and Discover - To see reality as it really is. To take in new givens.
📚 Integrate and Correlate - To create stories explaining what you see. To record history. To correlate givens with other givens.
🖧 Reticulate and Network - To trade stories with entities different from yourself. To form relationship networks with others.
Self-evident actions
- 💣 Destroy and Forget - To nullify an action.
Note on Destroy and Forget
This is a weird one since I have a hard time imagining a sentient being whose nature is Destruction, but it is a fundamental action required to derive many other actions. It feels like the concept of "zero" in math. Multiplying by 0 is not useful when performed in isolation but is required when part of a process of selectively ignoring aspects of a signal for the purpose of amplifying useful signals. Part of the act of creation is the removal of the construction waste. Observe and Discover involves collecting data; part of integrating and correlating information is the ignoring of many observed data points in favor of highlighting certain data points that promote/accelerate all activities. Likewise, if only creation and strengthening of relationships were permitted, with no ability to dissolve/unlink relationships, then new relationships would be inhibited if material resources must be dedicated to maintain each relationship; destruction of old relationships (chemical bonds or social ties) is necessary.
However, Destroy and Forget isn't really a useful "purpose to live" unless used in combination with the other self-evident actions. Likewise, to Observe and Discover isn't really useful if data is not correlated or shared. Sharing of data can happen unwillingly (forced reticulation) as can destruction (unwilling destruction).
Derived Actions
Below are other actions defined in terms of the above self-evident common actions:
Touch - To form relationships (Reticulation) mediated by physical forces (electron-electron repulsion, photon emission/absorption).
Create - To simultaneously Integrate and Touch.
Defend - To secure integrity of other actions via Touch.
Replicate - To Create copies of Observers in order to increase the number of points of view from which reality can be Observed or to Defend against Destruction.
Expand - To increase spatial scope of Observation or Reticulation activities.
Control - To selectively Destroy actions (Observation, Integration, Reticulation, and all permutations of such).
Conquer - To Control in order to Expand.
Separate - To selectively Destroy undesired relationships between certain entities.
Liberate - To Destroy Control.
(etc.)
Comments
I am not completely committed to there being only three self-evident purposes. Perhaps another can be added but I think there should be a small number for this system to be useful.
One idea that amuses me is for a society that divides itself into different factions, each with certain principle actions guiding faction members.
A person of an Observe and Discover faction might primarily be involved with activities that expand the scope of what a civilization knows. This might include radio astronomy telescopes and cosmology. It might also have subfactions dedicated towards more introspective observations such as internal surveillance of a civilization's activities. Researchers would be primarily focused on developing tools that allow them to see farther.
A person of an Integrate and Correlate faction would primarily be involved in fact-checking and using observations from O&D to update abstract models of the civilization's internal state, the civilization's impact on external reality, and creating predictions of possible future problems based on history.
A person of a Reticulate and Network faction would primarily be involved in forming communication channels between nodes within and without of the civilization in order to collect useful information from internal and foreign I&C factions.
Many other actions such as Defend or Separate or Conquer or Liberate could become focal points of many factions. However, any civilization, local or foreign, would be guaranteed to have the factions of O&D, I&C, and R&N. As effects of the ongoing heat death of the cosmos continue, these self-evident purposes will persist into deep time. Members of civilizations that nullify these common actions will handicap themselves.
Nullification of Observe and Discover causes blindness.
Nullification of Integrate and Correlate causes madness.
Nullification of Reticulate and Network causes ignorance.
Application to the Four Freedoms
How does this system of belief help me to apply the Four Freedoms in a more general context that includes everything from factories to molecular assemblers? I'm not sure. One thing I do know is that my butt is tired from all this typing while sitting.
I made this tangent into self-evident purposes for existence with the hope that I could identify a way to explain to someone the value of applying the Four Freedoms even to potentially lethal machinery. The dream I have been trying to find a way towards is one in which every person has the option to manufacture their own life support equipment in an uninhabitable environment such as the vacuum of space, any future human space colony, even the surface of planet Earth itself if exponential population growth continues, or even a virtual environment where all humans are forced to live as non-biological emulated brains in a completely artificial substrate in machine cities. I went on a tangent because I wanted to explore a belief system that might survive even in combinations of such inhospitable environments. As of the year 2019, most humans would die if they switched to hunter-gatherer mode instead of relying upon machines to feed, clothe, and shelter them.
So what do these self-evident purposes buy me regarding applying the Four Freedoms to all hardware?
I think relevant questions for people concerned about losing their own freedoms caused by application of the Four Freedoms to all machines are:
What happens when anyone can print a fission bomb from raw materials in hours?
What happens when anyone can print a firearm on a whim in minutes?
The O&D and I&C factions must step up to the plate and keep up with manufacturing technology advances in order to search for signatures of uranium refining and other signs that advanced weaponry is being fabricated. R&N factions must quickly share information to track patterns of material movement. Human fragility must be buttressed by brain emulation backup and/or clone bodies that can withstand disasters. Law enforcement I&C factions must augment themselves to track and prosecute crimes as fast as new techniques are developed.
The pattern I am seeing here as I talk myself through the problem of Four Freedom lethal machines is speed and complexity. The faster a threat can be developed, the faster society-approved local law enforcement must be able to act to neutralize such threats. If the risk of physical harm is too high, then physical redundancies must be planned and implemented to minimize damage. The benefits of Four Freedom manufacturing hardware must outweigh the new threats: the increased capabilities offered to people must be valuable enough that they will defend their access to blueprints despite the increase in personal risk. Automobiles are lethal machines that are a significant cause of deaths in the United States, but users defend their rights to use them because of the benefits of personal mobility they offer. Licenses and law enforcement mitigate the risk of misuse but do not eliminate it.
Firearms cause massacres in the United States regularly yet there is a cultural inertia among lawmakers and people that vote for such lawmakers that causes them to refuse to ban firearms. There is no perceivable economic benefit to firearm ownership. The perceived benefit seems to me to be primarily imaginary: "Owning this gun gives me the power to defend myself with lethal force and that makes me feel safe."
Given that firearm ownership is something that remains fiercely defended in the United States, I imagine that at least one nation-state will permit Four Freedom machines to exist and become part of the local culture. The fact that no significant population actively promotes Four Freedom philosophy for manufacturing is probably because the population of "programmers" (ex: engineers on industry consensus standard technical committees) is low and they do not perceive any urgency for free/libre machinery.
Possible fertile ground for the idea of Four Freedom machines are discussions of "Right to Repair" and disgust about planned obsolescence. For example, several news stories discussed how farmers were pirating software required to operate John Deere agricultural equipment, which apparently uses an expensive license model completely at odds with how farm equipment has traditionally been maintained.
One argument that is coalescing in my mind as I write these thoughts is that if Four Freedoms aren't applied to industrial consensus standards and fabrication blueprints, then a larger and larger fraction of living humans will be priced out of the ability to participate in society. More and more of their resources will be required to buy licenses for services required to maintain employment and the certifications employers require.
Additionally, even "middle class" citizens of an industrialized nation will be vulnerable to actions of the relatively small number of licensors of life support technologies where a Four Freedom machine equivalent option is not available. As manufacturing processes become more centralized for the sake of efficiency, repair of "turn key" services such as automobile packages, municipal water treatment equipment, road maintenance equipment, material transport equipment (piping), and other infrastructure products will become more and more subject to license and service agreements. Already I know variable speed drive water pumps come equipped with Bluetooth transceivers that can only be configured via an app and a Bluetooth device that can cost significant amounts of money. The pump manufacturer could charge money for the app, and the closed source nature of the app makes the pump vulnerable to cyberattack. Problems that the Free Software Foundation argued its Four Freedoms were protecting people from in the realm of software will become more problematic in realms of industrial equipment as the equipment becomes more "smart". This will especially apply to a future where local 3D printing of industrial equipment becomes commonplace. If the digital programs fed to "matter compilers" (MCs) do not come from design "manufacturers" with the source code (today, the equivalent of P&IDs, instrumentation diagrams, mechanical drawings, control philosophy, etc.), then the MC owners are subject to the designer's will, similar to how Microsoft strongarmed its users into using Internet Explorer instead of competing web browsers. Hardware manufacturing can be more free, but there has to be an active force for freedom. Otherwise the path of least resistance is centralized control by a small number of licensors.
Conclusion
I'll end this rather lengthy rambling blog post with a short summary. The Four Freedoms applied to the realm of industrial machinery will force civilization to augment its speed and detection capabilities for lethal tool fabrication. Lethal tool fabrication increases the risk of loss of your individual freedom if that tool is used against you. In a tangent I discussed a recurring thought I had regarding a universal set of purposes for living that may help any person build communication bridges with foreigners unlike yourself who may think Four Freedom machines are a pipe dream. I then discussed the recent appearance of the Right to Repair idea and how it may be fertile ground for discussing Four Freedom industrial machinery. I close by discussing the value of Four Freedom machinery and the potential negative consequences of failing to use Four Freedom machinery (loss of freedom).
baltakatei 🅭🅯🄎 4.0
Introduction
This is an explanation of how a distance-bounding protocol can be performed using the sending of bit sequences between a Verifier V and a Prover P.
MathML browser support is required. Firefox 60.0 esr should render MathML symbols correctly.
Background
Explanation, part 1: setup
I will attempt to explain my understanding of the Hancke-Kuhn distance-bounding protocol in my own words for a layman reader. If there are errors or misrepresentations then I am at fault.
Gerhard P. Hancke and Markus G. Kuhn proposed a distance-bounding protocol as a defense against man-in-the-middle attacks for people who use RFID tokens in order to automatically authenticate themselves for a location-based service such as the opening of a door or purchase at a specific point-of-sale device.
An example of a man-in-the-middle attack against such a building access-control system could be two attackers maliciously forwarding radio traffic between an RFID token and a building RFID reader without the RFID token owner's knowledge, even in the case where the token is located at a great distance from the reader. The idea for strengthening an RFID token against such an attack is to equip the building RFID reader with some means of proving that the token is physically located within a specific distance.
The goal of this project is to apply this concept to the ping time between two computers in order to prove how close the computers are to each other. A distance-bounding protocol proof uses the distance, speed, and time equation solved for distance.
distance = speed ⋅ time
The speed is set to the speed of light since one conclusion from the theory of special relativity is that no information signal or material can travel faster than light in a vacuum. The time is set to half the ping time (round trip time divided by 2).
distance = speed of light ⋅ (ping time / 2)
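This bound is a one-liner in code. A minimal sketch (the function name and example ping value are my own):

```python
C = 299_792_458  # speed of light in a vacuum, m/s

def max_distance(ping_seconds):
    """Upper bound on one-way distance: c times half the round-trip time."""
    return C * (ping_seconds / 2)

# A 1 millisecond ping bounds the distance to about 150 km:
print(max_distance(0.001))  # 149896.229 m
```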
In the protocol, a verifier, V, and a prover, P, create a pair of one-time-use pseudorandom bit sequences, R0 and R1, each containing n elements. Each element Ri0 or Ri1 is a bit whose value is either 0 or 1. These sequences can be represented like so:
R0 = R10 R20 R30 R40 R50 ⋯ Rn0
R1 = R11 R21 R31 R41 R51 ⋯ Rn1
Regarding these bit sequences, V rapidly asks P a stream of n questions. A question may take only one of the two forms:
What is the ith bit of R0, Ri0?
What is the ith bit of R1, Ri1?
The stream of questions start with i=1 and end with i=n.
In order to decide which question V asks P, V generates a private random bit sequence, C, which consists of n elements. The rule V follows is that if Ci=0 then V requests that P supply Ri0. If Ci=1 then V requests that P supply Ri1. In other words, at each round, i, V randomly chooses which of the two questions to ask P.
After sending a question to P, V records the exact time and increments i by 1.
Because cause must precede effect, P cannot provide a correct answer to V until after P receives the question. Since the speed of light is the maximum rate at which any information can travel through space, there is a minimum ping time (or "time of flight") for any given distance between V and P which can be used by the protocol to prove an upper bound to the distance between V and P.
Immediately after receiving a question, P sends to V the value RiCi which is the requested bit from either R0 or R1. The set of these responses can be written as RCi.
Upon receiving each response, V records the exact time in order to calculate that particular question-response round-trip time (or "ping time").
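The rapid bit exchange described above can be sketched in a few lines. This is my own illustrative sketch, not the Hancke-Kuhn reference implementation; the names are invented, and a real verifier would also timestamp the send and receive of each round:

```python
import secrets

def run_rounds(r0, r1, n):
    """Sketch of the Hancke-Kuhn rapid bit exchange.
    r0, r1: shared pseudorandom bit sequences (lists of 0/1) of length n.
    Returns V's challenge bits C and P's responses R^C."""
    challenges = []
    responses = []
    for i in range(n):
        c_i = secrets.randbits(1)  # V's secret challenge bit for round i
        challenges.append(c_i)
        # P immediately answers with the i-th bit of R0 (if c_i == 0)
        # or of R1 (if c_i == 1):
        responses.append(r0[i] if c_i == 0 else r1[i])
    return challenges, responses
```

The timing measurements, omitted here, are what turn this exchange into a distance bound.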
Example 1: how the bit sequences are used
To help explain how this process works below is an example that sets n=16 and walks you through how to calculate the response bit sequence, RCi.
Verifier V and Prover P assemble and agree upon pseudorandom bit sequences R0 and R1
R0 = 0 1 0 0 1 0 1 1 1 0 1 1 0 0 1 0
R1 = 1 0 0 0 1 1 1 1 0 1 1 0 1 0 0 1
Verifier V secretly produces a pseudorandom bit sequence Ci:
- Ci = 0 0 0 0 1 0 1 1 0 0 0 1 1 1 0 1
V sends each bit of Ci , one at a time, starting from i=1 until i=n. V notes the exact time when it sent each value of Ci.
P receives and uses each bit of Ci to determine whether to immediately send the bit Ri0 or Ri1 to V in response. If all bits are received and sent without error, P will eventually have sent the set RCi.
V receives and records the arrival time for each response bit, RiCi. V calculates the round-trip time for each round. The resulting values of RiCi are:
- RiCi = 0 1 0 0 1 0 1 1 1 0 1 0 1 0 1 1
Below is a table illustrating how the example values for these bit sequences correlate. I have bolded the values of Ri0 and Ri1 which were sent by P in response to the values of Ci sent by V.
| i | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| Ri0 | **0** | **1** | **0** | **0** | 1 | **0** | 1 | 1 | **1** | **0** | **1** | 1 | 0 | 0 | **1** | 0 |
| Ri1 | 1 | 0 | 0 | 0 | **1** | 1 | **1** | **1** | 0 | 1 | 1 | **0** | **1** | **0** | 0 | **1** |
| Ci | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 |
| RiCi | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
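The response row can be reproduced mechanically from R0, R1, and C. A small sketch using the example values above:

```python
# Example 1 values: shared sequences R0, R1 and V's challenge bits C.
R0 = [0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
R1 = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
C  = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1]

# At round i, P answers with R0[i] when C[i] == 0, else R1[i]:
RC = [r0 if c == 0 else r1 for r0, r1, c in zip(R0, R1, C)]
print(RC)  # [0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1]
```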
Explanation, part 2: False-Acceptance Rate
At each step V records the round trip time between the sending of the question and the receiving of the correct answer from P. Given enough correct answers from P, V can then use the average round trip time, tm, of correct responses in order to calculate with some statistical certainty that P is physically located within a distance, d. The distance, d, can be calculated using the following two equations (pg 68, Hancke 2005).
d = c ⋅ (tm − td) / 2
tm=2⋅tp+td
In the language of the Hancke paper, variables in the two equations are defined as:
c is the propagation speed, tp is the one way propagation time, tm is the measured total round-trip time, and td is the processing delay of the remote device.
A conservative practice defines td=0 for the processing delay variable. It is conservative because td is a function of the capabilities of the hardware P uses to process requests from V. If both P and V trust each other to use specific hardware with consistent and accurate estimates for response times, then td may be specified. However, the Hancke-Kuhn protocol does not provide a means for proving or incentivizing P to accurately measure and report its own hardware capability.
The highest possible propagation speed, c, according to the laws of physics is the speed of light in a vacuum. According to section 2.1.1.1 of the 8th edition of the International System of Units, a document published by the International Bureau of Weights and Measures, this speed is 299 792 458 m/s.
The statistical certainty that the round-trip time between P and V is less than tm is 1−pFA, where pFA is the "false-accept probability". The value of pFA must be a statistical estimate constrained by the possibility that prover, P, maliciously sends its best guesses before receiving the questions from V. If P dishonestly wishes to convince V that the distance is lower than it really is, then P can achieve a 3/4 probability of guessing correctly for a given round without having yet received that round's value of Ci. This is because, on average, half of the rounds do not require guessing at all since half the time Ri0=Ri1. The other half of the time, P's best strategy, assuming V generated C securely, is to guess 1 or 0 at random.
The false acceptance probability, or "False-Acceptance Rate", pFA, of V accepting the distance-bounding protocol proof of P can be calculated using the following equation found on the sixth page of the Hancke paper. This equation calculates pFA assuming V judges that receiving k correct responses out of n total rounds is acceptable.
pFA = Σ_{i=k}^{n} C(n,i) ⋅ (3/4)^i ⋅ (1/4)^(n−i)
The equation states that pFA is equal to the sum of each individual probability where P guessed correctly k or more times (for example: one outcome exists where P guesses perfectly, some outcomes where P makes only one mistake, some outcomes where P makes two mistakes, etc.). The total number of terms in the sum is n−k+1.
In other words, the final term (the n'th term) of the sum is the probability that P guesses correctly in exactly every single response (one very rare possibility). The penultimate term (the n−1'th term) is the probability that P guesses correctly every single time except for exactly one mistake somewhere (a slightly less rare possibility). The n−2'th term is the probability that P guesses all responses correctly but with two errors somewhere. The n−3'th term is the probability that P guesses all responses correctly but with three errors somewhere, and so forth. The first term of the sum is the probability that P guesses correctly exactly k times out of n responses and therefore provided incorrect responses exactly n−k times. Each term of the sum is the binomial probability function (a.k.a. "binomial distribution formula" or "probability mass function") which should be part of the syllabus for a typical statistics course.
Since no factor of the equation for pFA can be made exactly equal to zero it is impossible for Verifier V to completely eliminate the possibility that P could forge this distance-bounding proof. The best V can do to strengthen confidence in the proof's validity is to set the parameters k and n to values that produce an acceptably low value for pFA, the probability of falsely accepting a maliciously constructed proof by Prover P.
Example 2: Calculating False-Acceptance Rate
Below is a copy of the previous example table but with values of Ri0 and Ri1 bolded when Ri0=Ri1. From inspection it should be clear that P does not have to guess in roughly half of the rounds, since a quarter of the time Ri0=Ri1=0 and a quarter of the time Ri0=Ri1=1.
| i | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| Ri0 | 0 | 1 | **0** | **0** | **1** | 0 | **1** | **1** | 1 | 0 | **1** | 1 | 0 | **0** | 1 | 0 |
| Ri1 | 1 | 0 | **0** | **0** | **1** | 1 | **1** | **1** | 0 | 1 | **1** | 0 | 1 | **0** | 0 | 1 |
| Ci | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 |
| RiCi | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Side note: I believe the inefficiency of allowing the protocol to have instances where Ri0=Ri1 is due to Hancke designing the protocol to be simple in order to accommodate implementation in RFID tags with limited computational ability and over noisy communication channels. The scope of this project doesn't include attempting to improve the protocol but to simply implement it as described in the Hancke paper.
In order to illustrate how the False-Acceptance Rate, pFA, is calculated, let us say that V was programmed to accept 14 correct responses out of 16 (k=14, n=16). In this case pFA could be calculated as described below.
The binomial coefficient factor in the pFA equation can be expanded out, with ! signifying the factorial operation (for example, 5!=5⋅4⋅3⋅2⋅1=120).
pFA = Σ_{i=k}^{n} [n! / (i! ⋅ (n−i)!)] ⋅ (3/4)^i ⋅ (1/4)^(n−i)
The sum consists of a total of n−k+1=16−14+1=3 terms.
The last term (i=n=16) is:
[16! / (16! ⋅ (16−16)!)] ⋅ (3/4)^16 ⋅ (1/4)^(16−16) = 1.00226 ⋅ 10^−2
The penultimate term (i=15) is:
16 ! 15 ! 16 − 15 ! ⋅ 3 4 15 ⋅ 1 4 16 − 15
= 5.34538 ⋅ 10 -2
The first term (i=k=14) is:
16 ! 14 ! 16 − 14 ! ⋅ 3 4 14 ⋅ 1 4 16 − 14
= 1.33635 ⋅ 10 -1
The sum of these three terms is:
1.00226 ⋅ 10 -2
+
5.34538 ⋅ 10 -2
+
1.33635 ⋅ 10 -1
= 1.97111 ⋅ 10 -1
Therefore, the False-Acceptance Rate, pFA can be written as:
p FA = ∑ i = k = 14 n = 16 n ! i ! n − i ! ⋅ 3 4 i ⋅ 1 4 n − i
= 1.97111 ⋅ 10 -1
= 19.7111 %
In other words, if V decides to accept only k=14 or more correct bits from from P out of a possible n=16 bits in the bit sequences they exchange, then there is about a 19.7% chance that P could fool V into accepting that the distance between them was lower than it physically is. P could do this by completely disregarding V's questions, C, and sending only best guesses for bit sequence RCi given the structure of R0 and R1.
Explanation of the Hancke-Kuhn Distance-Bounding Protocol
Created by Steven Baltakatei Sandoval on 2019-08-15T06:46:35Z under a CC BY-SA 4.0 license and last updated on 2019-08-15T20:16:52Z.
Introduction
It is possible to determine how far two computers are from each other using the speed of light and ping time. The physical distance is, at most, half the ping time multiplied by the speed of light. This document explains the Hancke-Kuhn protocol, which can calculate this upper bound for the distance between a Verifier V and a Prover P through the sending and receiving of certain bit sequences. This calculation is useful for location-based authentication technology (ex: RFID, contactless payment) defending against man-in-the-middle attacks.
I have written this explanation in order to help solidify my own understanding of the protocol before I write my own implementation of it at my GitLab repository. It is an explanation in my own words. Any errors or misrepresentations are entirely my own.
A more detailed summary with references to academic papers was published by Cristina Onete and may be found on her website's publication page.
This document makes use of MathML for displaying equations. Firefox 60.0 esr should render MathML symbols correctly.
Background
In 2005, Gerhard P. Hancke and Markus G. Kuhn proposed a distance-bounding protocol as a defense against man-in-the-middle attacks for people who use RFID tokens in order to automatically authenticate themselves for a location-based service such as the opening of a door or purchase at a specific point-of-sale device.
An example of a man-in-the-middle attack on such a building access-control system could be two attackers maliciously forwarding radio traffic between an RFID token and a building RFID reader without the token owner's knowledge, even when the token is located a great distance from the reader. The idea for strengthening an RFID token against such an attack is to equip the building RFID reader with some means of proving the token is physically located within a specific distance.
Explanation, part 1: setup
The goal of this project is to apply this concept to the ping time between two computers in order to prove how close the computers are to each other. A distance-bounding protocol proof uses the distance, speed, and time equation solved for distance.
distance = speed ⋅ time
The speed is set to the speed of light, since one conclusion from the theory of special relativity is that no information signal or material can travel faster than light in a vacuum. The time is set to half the ping time (round-trip time divided by 2).
distance = speed of light ⋅ (ping time / 2)
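This bound can be computed directly. Below is a minimal Python sketch; the 10 ms ping value is only an illustrative input, not a measurement from the paper:

```python
# Upper bound on distance implied by a measured round-trip ("ping") time.
C = 299_792_458  # speed of light in a vacuum, m/s

def distance_upper_bound(ping_time_s: float) -> float:
    """Return the maximum possible separation in meters for a round-trip time."""
    return C * (ping_time_s / 2)

# A hypothetical 10 ms ping bounds the distance to roughly 1499 km.
print(distance_upper_bound(0.010))  # → 1498962.29
```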
In the protocol, a verifier, V, and a prover, P, create a pair of one-time-use pseudorandom bit sequences, R0 and R1, each containing n elements. Each element Ri0 or Ri1 is a bit whose value is either 0 or 1. These sequences can be represented like so:
R0 = R10 R20 R30 R40 R50 ⋯ Rn0
R1 = R11 R21 R31 R41 R51 ⋯ Rn1
Regarding these bit sequences, V rapidly asks P a stream of n questions. A question may take only one of the two forms:
What is the ith bit of R0, Ri0?
What is the ith bit of R1, Ri1?
The stream of questions starts with i=1 and ends with i=n.
In order to decide which question V asks P, V generates a private random bit sequence, C, which consists of n elements. The rule V follows is that if Ci=0 then V requests that P supply Ri0. If Ci=1 then V requests that P supply Ri1. In other words, at each round, i, V randomly chooses which of the two questions to ask P.
After sending a question to P, V records the exact time and increments i by 1.
Because cause must precede effect, P cannot provide a correct answer to V until after P receives the question. Since the speed of light is the maximum rate at which any information can travel through space, there is a minimum ping time (or "time of flight") for any given distance between V and P which can be used by the protocol to prove an upper bound to the distance between V and P.
Immediately after receiving a question, P sends to V the value RiCi which is the requested bit from either R0 or R1. The set of these responses can be written as RCi.
Upon receiving each response, V records the exact time in order to calculate that particular question-response round-trip time (or "ping time").
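The exchange described above can be sketched in Python. The function and variable names here are my own, and `secrets` stands in for whatever cryptographically secure generator a real implementation would use:

```python
import secrets

def make_bits(n: int) -> list[int]:
    """Generate n unpredictable bits from a cryptographically secure source."""
    return [secrets.randbelow(2) for _ in range(n)]

def run_rounds(r0: list[int], r1: list[int], c: list[int]) -> list[int]:
    """Honest prover's replies: round i answers Ri0 when Ci = 0, Ri1 when Ci = 1."""
    return [r1[i] if c[i] else r0[i] for i in range(len(c))]

n = 16
r0, r1 = make_bits(n), make_bits(n)  # agreed upon by V and P before the timed phase
c = make_bits(n)                     # V's secret challenge bits, revealed one per round
responses = run_rounds(r0, r1, c)    # the bit sequence RCi that P sends back
```

In a real run, V would timestamp each challenge as it is sent and each response as it arrives; only the bit bookkeeping is sketched here.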
Example 1: how the bit sequences are used
To help explain how this process works below is an example that sets n=16 and walks you through how to calculate the response bit sequence, RCi.
Verifier V and Prover P assemble and agree upon pseudorandom bit sequences R0 and R1
Ri0 = 0 1 0 0 1 0 1 1 1 0 1 1 0 0 1 0
Ri1 = 1 0 0 0 1 1 1 1 0 1 1 0 1 0 0 1
Verifier V secretly produces a pseudorandom bit sequence Ci:
- Ci = 0 0 0 0 1 0 1 1 0 0 0 1 1 1 0 1
V sends each bit of Ci , one at a time, starting from i=1 until i=n. V notes the exact time when it sent each value of Ci.
P receives and uses each bit of Ci to determine whether to immediately send the bit Ri0 or Ri1 to V in response. If all bits are received and sent without error, P will eventually have sent the set RCi.
V receives and records the arrival time for each response bit, RiCi. V calculates the round-trip time for each round. The resulting values of RiCi are:
- RiCi = 0 1 0 0 1 0 1 1 1 0 1 0 1 0 1 1
Below is a table illustrating how the example values for these bit sequences correlate. I have bolded the values of Ri0 and Ri1 which were sent by P in response to the values of Ci sent by V.
i | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ri0 | **0** | **1** | **0** | **0** | 1 | **0** | 1 | 1 | **1** | **0** | **1** | 1 | 0 | 0 | **1** | 0 |
Ri1 | 1 | 0 | 0 | 0 | **1** | 1 | **1** | **1** | 0 | 1 | 1 | **0** | **1** | **0** | 0 | **1** |
Ci | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 |
RiCi | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
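As a quick check, the response row of this table can be recomputed from R0, R1, and C with a one-line Python comprehension:

```python
# Bit sequences from Example 1 (n = 16).
R0 = [0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
R1 = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
C  = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1]

# P answers Ri0 when Ci = 0 and Ri1 when Ci = 1.
RC = [R1[i] if C[i] else R0[i] for i in range(16)]
print(RC)  # → [0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1]
```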
Explanation, part 2: False-Acceptance Rate
At each step V records the round-trip time between the sending of the question and the receiving of the correct answer from P. Given enough correct answers from P, V can then use the average value of the round-trip time, tm, of correct responses in order to calculate with some statistical certainty that P is physically located within a distance, d. The distance, d, can be calculated using the following two equations (pg 68, Hancke 2005).
d = c ⋅ (tm − td) / 2
tm = 2 ⋅ tp + td
In the language of the Hancke paper, variables in the two equations are defined as:
c is the propagation speed, tp is the one-way propagation time, tm is the measured total round-trip time, and td is the processing delay of the remote device.
A conservative practice defines td=0 for the processing delay variable. It is conservative because td is a function of the capabilities of the hardware P uses to process requests from V. If both P and V trust each other to use specific hardware with consistent and accurate estimates for response times, then td may be specified. However, the Hancke-Kuhn protocol does not provide a means for proving or incentivizing P to accurately measure and report its own hardware capability.
The highest possible propagation speed, c, according to the laws of physics is the speed of light in a vacuum. According to section 2.1.1.1 of the 8th edition of the International System of Units, a document published by the International Bureau of Weights and Measures, this speed is 299 792 458 m/s.
The statistical certainty that the round-trip time between P and V is less than tm is 1−pFA, where pFA is the "false-accept probability". The value of pFA must be a statistical estimate constrained by the possibility that the prover, P, maliciously sends its best guesses before receiving the questions from V. If P dishonestly wishes to convince V that the distance is lower than it really is, then P can achieve a 3/4 probability of guessing correctly for a given round without having yet received that round's value of Ci. This is because, on average, half of the rounds do not require guessing at all, since half the time Ri0=Ri1. The other half of the time, P's best strategy, assuming V generated C securely, is to guess 1 or 0 at random.
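The 3/4 figure can be checked empirically. The Python sketch below simulates the cheating strategy just described (when the two bits are equal P sends that bit; when they differ P guesses at random) over many independent rounds; the function name and trial count are my own choices:

```python
import random

def guess_success_rate(trials: int = 100_000) -> float:
    """Estimate a cheating prover's per-round success probability.

    Before seeing Ci, P's best move is: if Ri0 == Ri1, send that bit
    (always correct); otherwise guess at random (correct half the time).
    """
    random.seed(0)  # fixed seed so the simulation is repeatable
    wins = 0
    for _ in range(trials):
        r0, r1, c = (random.getrandbits(1) for _ in range(3))
        guess = r0 if r0 == r1 else random.getrandbits(1)
        wins += guess == (r1 if c else r0)
    return wins / trials

print(guess_success_rate())  # ≈ 0.75
```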
The false acceptance probability, or "False-Acceptance Rate", pFA, of V accepting the distance-bounding protocol proof of P can be calculated using the following equation found on the sixth page of the Hancke paper. This equation calculates pFA assuming V judges that receiving k correct responses out of n total rounds is acceptable.
pFA = ∑_{i=k}^{n} C(n, i) ⋅ (3/4)^i ⋅ (1/4)^(n−i)
where C(n, i) is the binomial coefficient ("n choose i").
The equation states that pFA is equal to the sum of each individual probability where P guessed correctly k or more times (for example: one outcome exists where P guesses perfectly, some outcomes where P makes only one mistake, some outcomes where P makes two mistakes, etc.). The total number of terms in the sum is n−k+1.
In other words, the final term (the n'th term) of the sum is the probability that P guesses every single response correctly (one very rare possibility). The penultimate term (the n−1'th term) is the probability that P guesses correctly every single time except for exactly one mistake somewhere (a slightly less rare possibility). The n−2'th term is the probability that P guesses all responses correctly but with two errors somewhere. The n−3'th term is the probability that P guesses all responses correctly but with three errors somewhere, and so forth. The first term of the sum is the probability that P guesses correctly exactly k times out of n responses and therefore provided incorrect responses exactly n−k times. Each term of the sum is the binomial probability function (a.k.a. "binomial distribution formula" or "probability mass function"), which should be part of the syllabus for any typical statistics course.
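Since each term is a binomial probability, pFA is straightforward to compute. A short Python sketch using the standard library's math.comb (the function name is my own):

```python
from math import comb

def p_false_accept(k: int, n: int) -> float:
    """Probability that a guessing prover answers at least k of n rounds correctly."""
    return sum(comb(n, i) * (3 / 4) ** i * (1 / 4) ** (n - i)
               for i in range(k, n + 1))

print(p_false_accept(14, 16))  # ≈ 0.197111
```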
Since no factor of the equation for pFA can be made exactly equal to zero, it is impossible for Verifier V to completely eliminate the possibility that P could forge this distance-bounding proof. The best V can do to strengthen confidence in the proof's validity is to set the parameters k and n to values that produce an acceptably low value for pFA, the probability of falsely accepting a maliciously constructed proof by Prover P.
Example 2: Calculating False-Acceptance Rate
Below is a copy of the previous example table but with values of Ri0 and Ri1 bolded when Ri0=Ri1. From inspection it should be clear that P does not have to guess in roughly half of the rounds, since a quarter of the time Ri0=Ri1=0 and a quarter of the time Ri0=Ri1=1.
i | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ri0 | 0 | 1 | **0** | **0** | **1** | 0 | **1** | **1** | 1 | 0 | **1** | 1 | 0 | **0** | 1 | 0 |
Ri1 | 1 | 0 | **0** | **0** | **1** | 1 | **1** | **1** | 0 | 1 | **1** | 0 | 1 | **0** | 0 | 1 |
Ci | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 |
RiCi | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Side note: I believe the inefficiency of allowing the protocol to have instances where Ri0=Ri1 is due to Hancke designing the protocol to be simple in order to accommodate implementation in RFID tags with limited computational ability and over noisy communication channels. The scope of my [Proof of Ping project][glbk_2019_pypop] doesn't include attempting to improve the protocol but to simply implement it as described in the Hancke paper.
In order to illustrate how the False-Acceptance Rate, pFA, is calculated, let us say that V was programmed to accept 14 correct responses out of 16 (k=14, n=16). For this case the calculation of pFA is detailed in this spreadsheet file (in ODS format) as well as directly below.
The binomial coefficient factor in the pFA equation can be expanded out, with ! signifying the factorial operation (for example, 5!=5⋅4⋅3⋅2⋅1=120).
pFA = ∑_{i=k}^{n} n! / (i! ⋅ (n−i)!) ⋅ (3/4)^i ⋅ (1/4)^(n−i)
The sum consists of a total of n−k+1 = 16−14+1 = 3 terms.
The last term (i=n=16) is:
16! / (16! ⋅ (16−16)!) ⋅ (3/4)^16 ⋅ (1/4)^(16−16) = 1.00226 ⋅ 10^−2
The penultimate term (i=15) is:
16! / (15! ⋅ (16−15)!) ⋅ (3/4)^15 ⋅ (1/4)^(16−15) = 5.34538 ⋅ 10^−2
The first term (i=k=14) is:
16! / (14! ⋅ (16−14)!) ⋅ (3/4)^14 ⋅ (1/4)^(16−14) = 1.33635 ⋅ 10^−1
The sum of these three terms is:
1.00226 ⋅ 10^−2 + 5.34538 ⋅ 10^−2 + 1.33635 ⋅ 10^−1 = 1.97111 ⋅ 10^−1
Therefore, the False-Acceptance Rate, pFA can be written as:
pFA = ∑_{i=k=14}^{n=16} n! / (i! ⋅ (n−i)!) ⋅ (3/4)^i ⋅ (1/4)^(n−i) = 1.97111 ⋅ 10^−1 = 19.7111 %
In other words, if V decides to accept only k=14 or more correct bits from P out of a possible n=16 bits in the bit sequences they exchange, then there is about a 19.7% chance that P could fool V into accepting that the distance between them was lower than it physically is. P could do this by completely disregarding V's questions, C, and sending only best guesses for bit sequence RCi given the structure of R0 and R1.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.