Running a standalone JACK daemon after migration to PipeWire

I’ve recently had a bunch of problems with my audio setup on Linux after the migration to PipeWire. Here’s a summary for those who might run into similar problems.

  • System: Debian testing
  • Audio interface: Steinberg UR44C

First, jackd would no longer start. I worked around it by compiling jack2. But that also stopped working: the output volume dropped to almost zero. So I tried the PipeWire compatibility layer. But then I couldn’t open 44.1kHz sessions, and the 48kHz sessions would glitch. In total, I was stuck for about three weeks.

After resolving all the issues, I’m using a standalone JACK daemon for audio work. I was under the impression that audio was often glitching when using PipeWire as the backend. But I also experienced some glitching when working with jackd, so my assessment might be wrong. It could be because of audio plugins, or something else entirely. I’ll give PipeWire a go again soon.

Problem 1: Can’t start jackd, RequestRelease is not implemented

This was the first problem I encountered and the last problem I solved.

The main error message was “Problems starting jackd: Method RequestRelease is not implemented”. Asking on Reddit helped: I finally found out that the fix is to configure PipeWire to not reserve audio devices:

properties {
    alsa.reserve = false
}

By default, there’s no configuration file where you can put this line, so first you need to copy the example config into /etc and edit it:

sudo cp /usr/share/pipewire/media-session.d/alsa-monitor.conf \
    /etc/pipewire/media-session.d/alsa-monitor.conf
sudo vim /etc/pipewire/media-session.d/alsa-monitor.conf

Then find the line with “alsa.reserve” and set it to false (no quotes). Save the file and restart the PipeWire service:

systemctl --user restart pipewire

I think the error message should be friendlier. For example, if PipeWire won’t release the device, it should say so explicitly and point the user at instructions on how to disable the reservation.

Problem 2: pw-jack ardour can’t open a 44.1kHz project

Solution: pw-metadata -n settings 0 clock.force-rate 44100

This happens when working with Ardour and PipeWire as the backend. I tried to open a 44.1kHz Ardour project using the PipeWire compatibility mode.

pw-jack /opt/Ardour-6.9.0/bin/ardour6 /path/to/my_project.ardour

Ardour would say:

This session was created with a sample rate of 44100 Hz, but Ardour is currently running at 48000 Hz. If you load this session, audio may be played at the wrong sample rate.

Ardour, early 21st century

If you type “pw-jack -h”, it will tell you that it accepts “-s” as a command line argument to set the sample rate, but in my experience this doesn’t work. I asked about it on the Ardour forums and found the solution, which is to temporarily override the sample rate for the PipeWire daemon.

pw-metadata -n settings 0 clock.force-rate 44100

This can be done as a regular user. To return to the default, use a 0 instead of 44100.

Maybe the JACK compatibility layer in PipeWire is incomplete or buggy.

Problem 3: Locally compiled jackd is almost mute

Solution: Compile version 1.9.19 and not 1.9.17 (default).

Before I figured out the alsa.reserve = false thing, I was trying different things to run a standalone jackd. For example, I downloaded jack2 sources, compiled them, and got a locally built jackd which didn’t throw the “Method RequestRelease is not implemented” error. (Not sure why, maybe it just didn’t try to do the DBus negotiation. Maybe it simply opened the audio device.)

This worked for a time, but recently the locally compiled jackd would output audio at a very low volume. Ardour output meters would show a loud output, but I had to turn the headphone volume knob all the way to the right to hear a distant echo of my recording.

The UR44C audio interface doesn’t seem to have an ALSA volume control. If I run alsamixer, press F6 and select UR44C, I get “This sound device does not have any controls.”

After some trial and error I realized that the problem seems to be specific to jack2 version 1.9.17, which is currently the default (branch “develop” is set to the “v1.9.17” tag). There’s also a version 1.9.19 on GitHub, and that one, compiled locally, worked just fine. Unfortunately I didn’t get to the bottom of it.

I hope this helps, if anyone runs into the same problems. Drop me a line if you figured out any further details!

My recorded track won’t sync

In pandemic times many people try online music collaboration, and most of them run into track synchronization issues. They’d play the backing track on one device (say, a laptop) and record on another (say, a phone). The recorded track would for the love of Zeus not want to synchronize with the backing track, no matter how much they dragged it left or right. It’s baffling and frustrating.

What causes this? Using two unsynchronized clocks [devices] does.

And how to do it correctly? Use one clock [device]. Make sure that the same device plays the backing track(s) and records your performance.

Examples of setups that work:

  • A Digital Audio Workstation on a laptop or PC
  • A multitrack app on the phone (wired headphones recommended)
  • An audio recorder with the overdub function

Examples of setups that don’t work:

  • Any situation when you listen from one device and record on another.

You might think that clocks run at a constant speed, but it isn’t true. You can see it for yourself. Take any recording, about 5 minutes long, and import it into your DAW. Then play it back, and record it on your phone. Then import the recording from the phone into your DAW.

It won’t align.

It won’t align no matter how much you drag it back and forth. Either the beginning is in sync and the ending isn’t, or vice versa.

You might get away with short recordings, up to maybe a minute or a minute and a half. They will still drift, just not enough for the drift to be perceptible.

Let’s say this represents the backing track:

[Figure: a sequence of evenly spaced vertical lines on squared paper.]
You can imagine that the lines are bar lines, or beats. Musical time.

Then you record your track on top of it, and you hope that it looks like this:

[Figure: a sequence of evenly spaced vertical lines, which do not align with the grid of the squared paper.]
The lines are off grid. You could drag them a little to the left and they would align again. But…

But in reality, your recorded track looks like this:

[Figure: a sequence of unevenly spaced vertical lines, which also do not align with the grid.]
Look closely. The lines are not spaced evenly. Even worse, the number of lines is different!

Compare it to the original:

[Figure: the unevenly spaced sequence drawn next to the evenly spaced sequence. Corresponding vertical lines (number 1, number 2 and so on) are connected to show that the misalignment grows to the right.]
Comparison with the backing track shows that the vertical lines are increasingly off grid.

How to fix this? You can’t easily fix a track that has been recorded this way. You would need to cut it into pieces and align each piece individually. I wholeheartedly advise against it. You will never be 100% sure if you aligned each piece correctly. You’ll also destroy any subtle timing properties of the recording. Maybe the musician wanted this phrase to be slightly behind the beat? It’s best to record the track again, sorry.

The solution is to record audio on the same device that plays the backing track.

It’s a subtle problem. Minuscule differences in clock speeds accumulate over time. A 0.02% imperfection in clock speed will be perceptible in a recording. You might think I’m crazy or pedantic saying that clocks of our phones and laptops are that inaccurate. But they really are! We are used to phones and laptops showing accurate time, but this is only because they synchronize time over the network. The source of accurate time is a set of atomic clocks.
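To put rough numbers on this, here is a back-of-the-envelope calculation in Python. The 0.02% figure is the one quoted above; the recording lengths are just examples, and the function name is made up for illustration.

```python
def drift_ms(duration_s: float, clock_error_percent: float) -> float:
    """Return how far apart two tracks end up, in milliseconds, when one
    device's clock runs clock_error_percent faster than the other's."""
    return duration_s * (clock_error_percent / 100.0) * 1000.0


if __name__ == "__main__":
    # Example recording lengths in minutes; 0.02% is the clock error from the text.
    for minutes in (1, 1.5, 5):
        ms = drift_ms(minutes * 60, 0.02)
        print(f"{minutes} min recording: about {ms:.0f} ms of drift at the end")
```

Under these assumptions a 5-minute take ends up about 60 ms out of sync by its last bar, while a 1-minute take drifts by about 12 ms, which fits the observation that short recordings are more forgiving.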

If my laptop’s or phone’s clock speed is wobbly, how is it possible to ever record anything in sync? When you’re using a single device for recording, it will still speed up and slow down, but both playback and recording speed up and slow down together, and don’t drift apart.

I hope this article sheds some light on the problem, and that you see how the issue is rooted in basic physics. I also hope I convinced you that if you want to record, you need to invest some time into learning a simple DAW, or get an audio recorder with the overdub function.

Problems logging into PSN

In the spirit of xkcd 979, let me describe something that recently happened.

My PS4 could not log into PSN, the PlayStation Network. It would show various error codes:


Then I logged out, and tried to log in by inputting the password. I saw a new error code:


I tried logging into PSN from my laptop, from the same home network. I saw:

This page can't be displayed. Contact support for additional information.
The incident ID is: N/A.

Sometimes I’d see a long incident number. It would change with each attempt.

I called the PlayStation support center. They didn’t have any information about the error codes I saw. They checked that my IP address was not blocked. They suggested setting up port forwarding. I thought it would be an odd way of dealing with a login problem.

I rang up my ISP. The support person said that there’s a problem with my broadband connection and they’re going to send a technician. The technician detected that some cable outside of my house got rusty and the signal level went down from -1dB to -11dB. The technician fixed this. I waited maybe 3 days and my PS4 started logging into PSN without problems.

So, what was it? Why would a weak broadband connection result in PSN refusing logins from my home network? I don’t know! I could speculate, but I have so little information that I would almost certainly be wrong.

Barely tolerable

Some diseases and other problems limit themselves. A parasitic colony must limit the level of exploitation of the host, because a dying host will kill the parasite with it. The colony must limit itself. It can also be limited by circumstances. A sick and hungry host might mean that the parasite will also be hungry and weak.

Today’s computers don’t seem faster than those of 10, 20, or 30 years ago. I’m sitting in front of a computer with an Intel processor clocked at 2.7GHz, but when I press a key, I wait longer than I did when I used a computer with a Zilog Z80 processor clocked at 0.0035GHz. The clock is about 770 times faster. Why do I have to wait longer for a reaction than I did before? Or, why do I wait more or less exactly long enough to be a bit irritated, but not so long that I get frustrated to the point of giving up on my computer?

Making code changes is annoying, because I can’t just change what I need to. I have to make changes in 8 other places, because different parts of the code aren’t fully independent, and layers of abstraction are leaky. I curse, and painstakingly make my changes, hitting one snag after another. Still, the problems I encounter are not grave enough to discourage me from making my changes, or to make me start a project to clean it up.

Car drivers are stuck in traffic. The state keeps on building new roads, and makes the existing ones wider, but somehow the traffic is getting worse. A long time ago a journey from A to B was slow, because you had to use a horse and a cart. Then came the automobile, but it was as slow as the horse. Later, a fast car came along, but speed limits and traffic lights also appeared. Other fast cars are in the way, and once you get there, it’s hard to park, which takes additional time.

Extrapolating all this… why do we live in an environment which we can barely stand?

It has to be that way, by definition. If the environment were not bearable, we would have done something to change it. When we accept the environment as is, we let it drift, and that drift has only one direction: toward the worse.

Why only toward the worse?

A sand castle is only one of many possible configurations of grains of sand. However, among all the possibilities, the orderly ones are a tiny minority. The majority of possible configurations are a gravity-flattened mound of sand without any edges, corners, circles, walls, or any other regular shapes.

Among all the possible states of our environment, the majority are sorry, messy, ineffective, not gratifying, and ugly. The number of possible orderly states is much smaller. Thus, when our environment is drifting from the current state into an adjacent random one, it is almost certain that it will transition into a worse state.

There are more inefficient versions of our computer program than efficient ones.

There are more ways to cram the city with cars than to maintain free space on the streets.

Everything around us undergoes something akin to evolution, except without natural selection. The pervasive disorder surges in all aspects of our lives.

We know from experience, however, that it’s not always like that. There are some fast computer programs. Some cities aren’t choking on cars. There is computer code that’s pleasant to work with. I’ll ask again, then: why do we live in an environment that is hardly bearable?

Aren’t we guilty of this? Why do we only react instead of working proactively? Why do we wait idly until the environment is intolerable?

Vem pro meu lounge – bass line transcript

Some two years ago, I was listening to a compilation of forró music on YouTube, and one bass line caught my attention. The song was Vem pro meu lounge by Wesley Safadão. Virtually everybody who heard about my interest in the song’s bass line mocked me for it. Apparently Safadão’s music is not held in high regard. But I was truly impressed that such a rich and improvised bass line could make it into popular music. That could never happen in Europe.

Sadly, the original recording I worked with has been taken down from YouTube. There are a number of live recordings of the song; they all sound similar, but since Lourinho improvises the bass line, none of them will be an exact match (this one is quite close).

I’m not 100% sure who the bass player on this recording is, but it’s probably Guilherme Santana (Instagram). The playing style closely matches other videos I’ve found online.


Here’s the bass line in ABC notation for future generations:

T:Vem pro meu lounge
V:1 bass nm="Bass guitar"
z8 z4 z2 C,,2 |"^intro" D,,8- D,,2A,,2 G,,F,,E,,C,, | D,,12 E,,4 | F,,16 | C,,16 | B,,,16 | %6
D,,8 .D,,.D,,2D,,/D,,/ .E,,.E,,2E,,/E,,/ | F,,16 | C,,8- C,,4- C,,2D,,C,, | %9
B,,,2 z2 B,,2C,2 D,2C,2B,,2A,,2 | G,,G,, z2 z4 z8 | B,,,B,,, z2 z4 z8 | F,,F,, z2 z4 z8 | %13
C,4 z2 F,G, B,A,G,F, D,2G,,2 || F,,G,, z2 F,G, z2 F,,G,, z2 G,F,D,C, | %15
B,,2<B,,,2 C,,D,,F,,G,, .B,,.B,, z B,,- B,,.B,, z C, | F,,4 C,D,F,D, .F,.F, z F, .E,.E,D,=D, | %17
C,D,D=D C.=A,,D,=D, C,.=E,,D,,=D,, C,,2=C,,2 | B,,,2>F,,2 G,,B,,G,,F,, B,,2<B,,,2 D,,=D,,C,,2 | %19
z2 F,,2E,,2D,,2 C,,2 z2 z4 | z16 | z8 z2 z C,, D,,F,,D,,C,, | %22
F,,2 z2 C,C,D,C, .F,.F, z (F, .E,)E,D,=D, | C,D,E,G, C=D,D=E ^ED,DD, CDA,D | %24
=A,D^A,D =A,DG,D F,2D,C, F,F,D,A,, | B,,B,,G,,G,, F,,G,,B,,B,,, z2 z C,, D,,F,,D,,C,, | %26
F,,2 z2 C,C,D,C, F,=A,,E,=E,, D,E,,=D, z | z C, z2 G,,A,,C,C,- C,C,A,,G,, C,,G,,C,,=D,, | %28
D,,2 z2 C,D, z C, D,.F,.D,A,, D,A,,C,F,, | B,,B,,G,, z F,,2B,,,2- B,,,2 z2 C,,D,,F,,G,, | %30
F,,2 z2 C,D,F,D, F,F,E,E, D,D,=D,D, | C,2 z2 B,,2C,2{/C,} D,2C,2B,,2A,,2 | %32
G,,G,, z2 z C,D,D, F,F,D,=D, C,B,,G,,F,, | B,,.B,,C,C, D,D,F,G,{A,} B, z A, z G,2.F,2 | %34
{/G,} A,CG, z F,2D,C, F,F,, z2 .E,,2.D,,2 | z C,,C,, z z G,,A,,C, z C,, z2 C,,D,,F,,D,, | %36
F,,G,, z2 F,G, z2 F,,G,, z2 G,F,D,C, | B,,B,,.G,,2 F,,2B,,,2- B,,,4 C,,D,,F,,D,, | %38
F,,2 z2 C,C,C,C, F,=E,,A,,E,, B,,E,,=C,E,, | C,G, z G, G,,G,,A,,G,, C,C,G,,2 C,G,,C,=C, | %40
B,,G,,F,,B,,,- B,,,F,,G,,B,, B,,,4 D,,=D,,C,,2 | z2 F,,2E,,2D,,2 C,,8 | %42
D,,2 z2 G,,A,, z C, D,F,.D,A,, D,A,,C,A,, | .B,,B,, z B,, D,2F,2 .B,,B,,.A,,A,, .B,,B,, z C,/D,/ | %44
F,.F, z F, z F,C,D, F,F, z F, E,E,D,=D, | C,D,G,A, B,A,G,F, E,D,C,A,, D,C,=A,,F,, | %46
D,,2 z2 G,,A,,C,A,, C,D,2A,, D,A,,C,G,, | B,,F, z F, F,,F,,G,,F,, B,,,2>C,,2 D,,F,,G,,D,, | %48
F,,2 z2 C,C,C,C, .F,2.E,2 D,A,,.=D,2 | C,C,A,, z G,,A,,C,2 z4 G,,A,,C,A,, | %50
C,D, z2 A,,A,, z C, D,F, z A,, D,A,,C,G,, | B,,B,, z B,, G,,F,,B,,,2- B,,,4 C,,D,,F,,G,, | %52
F,,2 z2 C,2 z C, z C,F,,2 F,,G,,D,=D, | C,2 z C, G,G,C,2 z C,G,G, C,G,C,2 | %54
D,2 z D, A,A,A,A, G,G,.F,2 .D,2.C,2 | B,,2 z2 F,,G,,B,,.B,, z2 .B,,B,,- B,,B,,G,,=G,, | %56
F,,2 z2 C,D,F,D, F,F,E,E, D,D,=D,D, | C,2 z C, A,,G,,E,,D,, z2 C,,2C,,2C,,2 | %58
D,,D,, z2 z4 z8 |] %59

The above should work with the online ABC editor.

Subtitles won’t help you

Let’s say you want to improve your understanding of spoken English. Watching movies and TV series is a great way to do it, but enabling subtitles (in any language) is an insidious habit.

What you think happens: “I read and listen at the same time.”

What really happens: You are only reading the subtitles, while ignoring the sound.

Yes, you might hear the sound, but you are not listening to it. Your attention is spread too thinly. If you’re learning English, processing English text consumes a hefty chunk of your attention. There’s little left for listening.

You might say, “well, I will only listen. I will only look at the subtitles when I need to.” Unfortunately, this doesn’t work either. First, it’s hard to avoid reading text. You need to actively suppress the urge to read, and if you lose attention for a second, you’ll find yourself reading again. But let’s suppose that you manage to ignore the subtitles, and you’re listening again. In general, when we’re listening to speech, understanding can come after a delay of one or two seconds. Until that time, you can’t be sure whether you understood or not. When two seconds pass and you notice that you didn’t understand what has just been said, the subtitle you need is already gone. (You get frustrated and go back to reading.)

What do I suggest instead? Try as follows: disable subtitles, focus hard (yes, it’s supposed to be an effort!), and see how much you understand. If you miss a sentence or two every now and then, don’t worry. If you have a feeling that you missed something important, rewind and try listening to it a few more times. If you’ve listened to it 10 times and you still don’t understand it, check the subtitles, but disable them again afterwards.

If you miss so much that you lose track of the story and/or start zoning out, that’s a signal that what you’re watching is too challenging. Try looking for something simpler, and perhaps shorter. You need to find materials that require a bit of effort, not too little, not too much: it’s up to you to gauge it. Make it a progression: once you’ve worked through a simpler one, move on to a more challenging one. If you feel frustrated, go back to an easier one. Bounce back and forth as much as you like.

For example, I find 2-3 minute long comedy sketches excellent for my Portuguese learning. They’re funny, I can analyze them in full, and they do present a challenge. Sometimes it’s really good to focus on a short piece of content, and work out every little bit of it. Then I go back to watching something longer, like Disney movies.

P.S. If you’re using subtitles in your native language, it’s even worse, because your brain is confused as to which language it’s supposed to operate in, and most likely locks onto the non-English language, shutting English out of your attention.

Flashcards for vocabulary

I’ve learned Portuguese mainly through conversations. They weren’t exactly ordinary conversations. First, how do you talk when you don’t know the language well? Second, how do you learn vocabulary in a conversation? Third, how do you further develop your language: are conversations enough?

My answers:

  1. I used to spend 10-15 minutes preparing for a 1-hour conversation. I would come up with a rough idea for a topic, and prepare a list of words I wanted to use.
  2. Whenever I encountered a new word, I wrote it down. Initially in a copybook, and later on my phone, in a simple note-taking app. I did it assiduously and patiently. These notes were the only tangible thing left after my conversations were over.
  3. Mastering a language doesn’t only happen through talking; you also need reading and writing. And to develop your reading and writing, you need to read and write.

This year, I’ve identified two areas for development:

  1. Extend my vocabulary. When talking to people, I am able to get by, but my limited vocabulary makes me feel that I’m repeating the same set of words, phrases, and even topics over and over again. Limited vocabulary is especially painful when I’m reading articles or graphic novels. Books are still a long way away.
  2. Learn more sentence templates. They are a great tool, allowing me to just plug in appropriate words to say what I want. Such templates aren’t trivial to pick up from speech. Comprehension from listening doesn’t imply the ability to produce speech. When I hear an interesting turn of phrase, I can, in principle, ask the other person to repeat it to me, but doing so painfully breaks the flow of conversation. I tend not to do that often.

Neither of these two things comes on its own; at least not enough. They require planned effort.

Starting from the second point, learning sentence templates is easy from the written word, especially from books with dialogues. But to absorb the written word, you first have to know what each word means! This means that the vocabulary issue has to be solved first. Once I have enough vocabulary, I’ll start reading more, and pick up the templates I need.

I returned to my vocabulary notes.


Browsing word lists is clearly better than nothing, but my lists weren’t well organized and they weren’t easy to follow. Surely there must be a way of organizing your word lists on paper, but instead, a colleague showed me AnkiDroid.

After a quick look around the app, I learned to write down new words as quickly as in my previous note-taking app.

The main difference is that AnkiDroid shows you a notification that you have flashcards to review. I’ve replaced the Facebook shortcut with AnkiDroid. I continued clicking the same spot, but instead of wasting time, I’ve started learning vocabulary. Success!

But there is a wrinkle.

Learning words from flashcards isn’t the same thing as learning words in a conversation.

Listening, speaking, reading and writing are four separate language skills. Vocabulary from one of them doesn’t necessarily transfer to the other three. In a sense, we have to learn each word four times. The first time, to recognize it in speech. The second time, to pronounce it correctly.

A digression: I once knew a guy who would say, for example, “I been thinking”, because he learned English from talking to people, never paid attention to detail, and probably hardly read anything. The “ve” ending escaped him. That is, it would escape him every day, three hundred times a day.

The third time, to recognize it in text. The fourth time, to write it down and spell it correctly.

What is the difference between words from flashcards and words from conversations?

Firstly, in the context, or rather lack thereof.

When I encounter a new word in a conversation, it’s always in a specific context: in a specific place, in a specific company, in a specific situation, sometimes even with a specific problem to be solved. Learning a new word that way, I don’t even need to repeat it many times. The use of this word in this specific situation means that when a similar situation arises again, the word comes back to me as part of this context.

Words learned that way feel very different from one another. There’s a story behind each of them; every word feels different, carries a different mood, color, sometimes even a taste or a smell.

Flashcards, on the other hand, have a rather humdrum existence. There’s just me, staring at my phone’s screen. I review 100-120 flashcards daily, so words have little chance to stand out from each other.

Secondly, with flashcards I’m pinning a Portuguese word to a corresponding Polish or English word, and not to the underlying concept. For example, when I see the word “cerca”, I have the appropriate image in my head in milliseconds. I see a human-height wooden or metal divide, surrounding a garden. I see the image and I know the concept, and not the corresponding words for it in other languages. Unfortunately, AnkiDroid is not able to read my thoughts; it can only show me the other side of the flashcard, which contains the corresponding word in a different language. I have to think: this wooden divide around a garden, what is it called in Polish? Ah, “płot”! And this additional step takes me a second or two.

An additional difficulty is that I’m trying to mentally separate languages from each other. When I’m using one language, I’m focusing on it, and I’m trying to filter other languages away. (This sometimes causes confusion when somebody unexpectedly says something in a different language, even if it’s my mother tongue.) When I’m studying the flashcards, I’m forced to switch between languages, while I’d rather not do that. Theoretically, I could prepare flashcards with pictures, but who has time for that?

Thirdly, I’m effectively training myself to associate words from different languages. That is, starting from a word in one language, I can find the corresponding word in another language. Unfortunately, this is not what I need when I converse! When speaking, I need to quickly find words that match the concepts I’m holding in my mind. If I want to use a word I’ve learned from a flashcard, I have to first find the correct word in a different language. Then I have to use the skill I learned from flashcards: find the corresponding word in the other language. This is too roundabout a way for me to do it fluently. I stammer.

I’m hoping that with time the most used words will find their way into appropriate contexts, and I won’t have to do the two-language round trip to fetch them.

What’s the benefit of flashcards then? It’s in the number of words I can cram into my head in limited time.


During the past month, spending roughly 15 minutes a day, I’ve learned ~50 words well, and another ~170 words to a reasonable degree.

Five Minute Breaks

About two weeks ago, I posted this in a few places:

Dear Lazy Plus,

For a few days now I’ve been taking regular 5-minute breaks during work. They are literally 5-minute breaks, with a timer running on my phone. I make a cup of tea, take a few sips, and di-ding! I’m heading back to my desk.

Thing is, there’s only so many times making tea is enjoyable. Or any beverage, for that matter.

I’m looking for something else to do in these 5-minute breaks. Ideally, something that isn’t staring at my phone, or involving any kind of screen. One thing I tried is juggling 3 balls. I’m not very good at it, which is great, because I feel that I’m improving slightly. But I’d like to add more activities. Any ideas?

Here’s the compilation of answers I’ve received:

  • Music room [I play bass]
  • Take a 5 minute nap on a bean bag
  • Doodle
  • Stairs
  • Rubik’s cube
  • Meditation (breathing, etc)
  • Do nothing, just relax, maybe reflect on what you did since the last break and why what you’re going to do next matters
  • Juggling a squash ball [I was offered instructions on how to do it]
  • Basic wrist / arm stretches
  • Go strike a conversation with someone on another floor
  • Origami. Yodeling. Bonsai. Learning to play the xylophone. Chemistry experiments. Psychological experiments.
  • Push-ups. Skipping rope. Chin-ups. Hold a pillar bridge/plank for 5 mins. Balance things on your head. Meditate. Journal.
  • Take up smoking [plus a suggestion of a smoking companion, haha]
  • Do two things with a Mobius strip. Do one thing with a piece of knot theory.
  • Deep Diaphragmatic Breathing, five minutes of that a day will revolutionize your life.
  • Work on doing impressions.

So far, the easiest thing to do was striking up a conversation with whoever happened to be around the micro kitchen. I’ve talked to several people whom I’d previously just passed in the corridor without even a greeting. Now I know their names and a little bit about them. It was satisfying. It will take me time to try out the other ones.

phpBB static archive

I looked online for instructions on how to create a static phpBB archive of a retired forum, and didn’t find much, apart from other people asking the same thing. I’ve investigated it myself.

UPDATE 2016-05-09: New things I found: How to archive phpBB (similar writeup), and phpbb3-static (a converter script).

UPDATE 2016-11-28: I’ve decided to do it again, better, using phpbb3-static.

General options

When choosing your approach, one of the criteria is the future maintenance cost. It’s likely that the reason that you want a static archive is that you want it to not require maintenance, or require as little as possible.

Option 1: Lock the forum and continue to run phpBB

  • Pros:
    • There’s little to do, so it’s quick.
  • Cons:
    • High maintenance. It’s not static. You’re still running PHP, so you have to keep on upgrading your PHP installation and your phpBB installation, or your forum archive will get hacked.

Option 2: Download the whole forum using wget or httrack

  • Pros:
    • The result looks the same as the original.
  • Cons:
    • The result looks the same as the original. (e.g. hard to browse on phones)
    • Out of the box, it does not work! It requires tweaks as discussed below.
    • Lots of content duplication. If there are different URLs with the same content, they will exist as separate files on disk.

Option 3: Write your own exporter

Query the database with SQL and write the output the way you want it.

  • Pros:
    • Low maintenance of the resulting site.
    • High level of control of how the output is structured.
  • Cons:
    • Writing the exporter is time consuming.
    • The output will most likely look different from the original forum, so people who are used to the forum will likely be confused by the navigation.
    • You need to put in additional work to preserve the old URLs.

Also… you could even generate a set of Markdown files to be fed as input to a static website generator such as Hugo. This would give you a lot of things for free, including nice URLs and a sitemap.
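A minimal sketch of what the Markdown-generating part of such an exporter could look like, in Python. This is not code I ran against a real forum. The shape of the input (a title plus a list of post dicts with 'author', 'timestamp' and 'text' keys) is a hypothetical structure you would fill from your own SQL queries, and the function name is made up. The only Hugo-specific assumption is that a page starts with a YAML front matter block containing title and date.

```python
import json
from datetime import datetime, timezone


def topic_to_markdown(title, posts):
    """Render one forum topic as a Markdown document with YAML front matter.

    `posts` is a list of dicts with hypothetical keys: 'author',
    'timestamp' (Unix seconds) and 'text' (post body, already converted
    to Markdown or HTML by some other step).
    """
    first = datetime.fromtimestamp(posts[0]["timestamp"], tz=timezone.utc)
    lines = [
        "---",
        # A JSON-quoted string is also a valid YAML double-quoted string.
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"date: {first.isoformat()}",
        "---",
        "",
    ]
    for post in posts:
        when = datetime.fromtimestamp(post["timestamp"], tz=timezone.utc)
        lines.append(f"**{post['author']}** wrote on {when:%Y-%m-%d %H:%M} UTC:")
        lines.append("")
        lines.append(post["text"])
        lines.append("")
    return "\n".join(lines)
```

Each returned string would be written to its own file in the generator’s content directory. Preserving the old forum URLs is a separate problem that this sketch does not attempt to solve.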

Option 4: Use an existing exporter

  • Pros:
    • Low maintenance result.
    • Takes less time than Option 3, with comparable results.
  • Cons:
    • You can’t expect the exporter to just work for you, especially if you’ve modified or heavily customized your forum. You will have to dig into the exporter script and fix issues in somebody else’s code.
      Archiving a forum is a one-off job. Once the result is satisfying, the user will lose interest in the exporter and will most likely not improve it any further. When you pick up an exporter, you’ll pick it up where the previous user left off.

Post content / bbcode

In my experience, processing the post content properly is the hardest problem. This is due to the format that phpBB uses to store posts in the database.

You would think that there is just one syntax – the one that forum users enter, which is stored in the database, and rendered into HTML when served on the web. In the case of phpBB it is not so: there are 3 formats! One for the user to edit, one to display (HTML) and something intermediate, that is stored in the database.

The existing exporter I found, phpbb3-static, used an existing bbcode parser to transform the database contents into HTML. The problem is that the database content isn’t bbcode, or at least it isn’t pure bbcode.

It’s a mix of HTML containing raw <a href="…">…</a> links and bbcode links (“[url]bbcode links[/url]”), and the existing bbcode parser also tries to linkify bare URLs that it spots in the content. If there’s something like this in the content…


…the end result is (indentation added for readability)…

<a href="$valid_url">
  <a href="$truncated_url">

…and that doesn’t work, because $truncated_url is… truncated. This is what phpBB does with long links by default: it shortens the visible text, turning “longlonglonglink” into “lo…nk”. The shortened text still starts with “http://”, so the bare link matcher catches it and wraps another <a href="…"></a> tag around it.
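
The double wrapping can be reproduced with a few lines of Python. This is only a sketch: the URLs are made up, and the regular expression is a stand-in for a bare-URL matcher, not the actual code of the bbcode parser used by phpbb3-static.

```python
import re

# Stand-in for a bare-URL matcher: anything that starts with http:// or
# https:// and is not directly preceded by a quote (so href values are skipped).
BARE_URL = re.compile(r'(?<!")https?://[^\s<"]+')


def naive_linkify(html):
    """Wrap every bare URL in an <a> tag, without checking the context."""
    return BARE_URL.sub(
        lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', html
    )


# Hypothetical stored post: a valid link whose visible text was shortened.
stored = (
    '<a href="http://example.com/a/very/long/path">'
    "http://example.com/a/ver...path</a>"
)

# The shortened text gets its own, broken, link nested inside the valid one.
print(naive_linkify(stored))
```

The matcher has no idea that the text it found is already inside a link, which is why the nested tag ends up pointing at the shortened, invalid address.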

I examined the database representation and realized that it’s complex, that improving the parser on my own would be futile, and that in the best case I would merely be reimplementing what has already been implemented in phpBB itself. Perhaps I could just call the generate_text_for_display() function from phpBB to render the HTML? Theoretically yes. Unfortunately, this function isn’t just a parser. It uses a number of global variables, such as $user and $cache. The $cache is used to access the forum configuration, and it makes SQL queries. As a result, what should be just a text parser requires the full phpBB environment.

I could wire the exporter to phpBB, but I thought that would make it dependent on a certain phpBB version. What I could do instead is make an HTTP request to the live version of the forum, find the right snippet of HTML, and save it.

I tried it. This method was an order of magnitude slower than in-process parsing. But on the positive side, it gave me the right results!
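
The fetch-and-extract idea could look roughly like the Python sketch below. It is not the code the exporter uses. It assumes that the forum theme renders each post body inside a <div class="content"> element; check your own theme's HTML and adjust the class name accordingly.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class PostContentExtractor(HTMLParser):
    """Collects the inner HTML of every <div class="content"> element."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.posts = []
        self._depth = 0  # <div> nesting depth inside a post body; 0 = outside
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._buf.append(self.get_starttag_text())
            if tag == "div":
                self._depth += 1
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if "content" in classes:  # assumed class name, theme dependent
                self._depth = 1

    def handle_startendtag(self, tag, attrs):
        if self._depth:  # self-closing tags such as <br />
            self._buf.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self._depth:
            return
        if tag == "div":
            self._depth -= 1
            if self._depth == 0:  # the post body just closed
                self.posts.append("".join(self._buf))
                self._buf = []
                return
        self._buf.append(f"</{tag}>")

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)

    def handle_entityref(self, name):
        if self._depth:
            self._buf.append(f"&{name};")

    def handle_charref(self, name):
        if self._depth:
            self._buf.append(f"&#{name};")


def extract_posts(html):
    """Return the rendered HTML of each post found in a topic page."""
    parser = PostContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts


def fetch_rendered_posts(url):
    """Download a topic page from the live forum and extract its posts."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_posts(html)
```

One HTTP round trip per page is what makes this approach slow compared to in-process parsing, but the HTML comes straight from phpBB's own renderer.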



[Obsolete] The previous attempt, using wget

Left here for the record. Superseded by the above approach, using phpbb3-static.

I’m intentionally not trying to write the whole thing in the form of a script, even though it was tempting. I expect different phpBB installations to vary, and the chance that my script would work with somebody else’s forum is slim. So instead I’ll write up what I did step by step, and people can follow this howto and make alterations as they see fit.

Note: I’m using Apache and I’m quoting Apache-specific configuration lines.

Mirroring the forum

I downloaded the database and the forum snapshot to a local computer to start a local instance. It’s a hassle but it makes things quicker. Once it was ready, I created a mirror on disk:

wget --mirror -k -p <Forum URL>

After downloading, it turned out that I had 127 thousand files on disk, which take up 5GB of space as shown by du -sh <directory>. I mean, I’ve seen larger in my career, but I expected a smaller size from a generally text-based static forum archive.

I put the result of wget’s work on a test server to see how it works.

Question marks

During testing it turned out that the “?” in the URL is treated as a special character. For example, when the browser requests this:

GET /style.php?id=1 HTTP/1.1

…the web server looks for a file on disk named style.php, fails to find it, and returns an HTTP 404 error.

HTTP 404: style.php not found

But in our case we want the server to serve the file named “style.php?id=1”!

$ ls -l style.php*
-rw-rw-r-- 1 maciej maciej 71445 Apr 24 15:58 style.php?id=1&lang=pl
-rw-rw-r-- 1 maciej maciej 71445 Apr 24 16:24 style.php?id=1&lang=pl&sid=2231c9b38ea28f9aa9e9bdd2a8452846

By the way, did you notice the file with sid? Ugh. Anyway…

With help from StackOverflow I’ve found these magic lines that I added to .htaccess:

RewriteCond %{ENV:REDIRECT_STATUS} !200 
RewriteCond %{QUERY_STRING} !^$ 
RewriteRule ^(.*)$ %{REQUEST_URI}\%3F%{QUERY_STRING} [noescape,last,qsdiscard]

I don’t fully understand what it does, but it seems to work. As far as I can tell, when the query string is not empty (“?foo=bar” in the URL), the request is rewritten by putting it back together from REQUEST_URI and QUERY_STRING, joined with “%3F”, which is a URL-encoded question mark. Once this is done, Apache understands that we mean a “?” in a file name on disk, and not a URL/query string combination. We also have to add “qsdiscard” to prevent Apache from appending the query string to the URL again. In a way, Apache is trying to do the right thing: keeping the file part and the query string part of the URL meaningful and separate. But in this case we want the opposite: to treat the “?” literally as part of the file name.
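
To make the effect of the rule easier to follow, here is a simplified Python model of it. It only mirrors my reading of the rule as described above; it is not how Apache is implemented, and the function names and the document root are invented for the example.

```python
from urllib.parse import unquote


def rewrite(request_uri, query_string):
    """Model of the RewriteRule: join path and query with an encoded '?'."""
    if not query_string:  # the QUERY_STRING condition: skip empty query strings
        return request_uri
    # Like qsdiscard, the query string is not appended a second time.
    return f"{request_uri}%3F{query_string}"


def path_on_disk(document_root, rewritten_uri):
    """Decode %3F back to a literal '?' when mapping the URL to a file."""
    return document_root + unquote(rewritten_uri)


target = rewrite("/style.php", "id=1&lang=pl")
print(target)
print(path_on_disk("/var/www/forum", target))
```

The second line printed is the file name that wget created on disk, question mark included.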

By the way, the solution I found on StackOverflow was slightly different and didn’t work for me verbatim.

Done-ish? Probably not

OK, so this is the rudimentary version of the archive. It has a number of disadvantages, but it meets the main criteria: we have static files and the content is there, you can browse it.

What are the problems?

  1. The login form and the search box are still there, which is confusing: people will try to log in and wonder why it’s broken.
    Addressed below.
  2. A number of URLs won’t work. There are a number of reasons for this; one of them is parameter ordering. The web server isn’t interpreting the query strings any more, so these two are different now:

    In the PHP world the parameters were parsed into the URL parameter namespace regardless of their order, but now Apache just looks for files on disk named exactly as specified in the URL. So some URLs that used to work, especially if somebody linked to your forum from the outside, will not work.

    Not addressed as of 2016-05-05.

  3. URLs are ugly. I know that search engines can deal with this sort of stuff, and they can do things like filtering out the “sid” parameter from the URL. But still, I keep on thinking that the forum URLs should be more like:

    Not addressed as of 2016-05-05.

  4. No sitemap.
    Not addressed as of 2016-05-05.
  5. Not mobile friendly. This isn’t a problem with the archiving process per se, but it is a feature I would expect in a good archive.
    Not addressed as of 2016-05-05.
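
One possible way to attack the parameter ordering problem (item 2) is to reduce every URL to a canonical form: drop the session parameter and sort the rest. The Python sketch below is a hypothetical approach, not something I have applied to the archive. The resulting names could be used to rename the mirrored files or to build a redirect map.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def canonical_name(url, drop=("sid",)):
    """Return the URL without session parameters and with the rest sorted."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    query = urlencode(sorted(params))
    return f"{parts.path}?{query}" if query else parts.path


# Two spellings of the same request end up with the same name.
print(canonical_name("viewtopic.php?t=5&f=2&sid=abc123"))
print(canonical_name("viewtopic.php?f=2&t=5"))
```

Incoming requests would need the same normalization on the server side before the file lookup, which this sketch does not cover.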

Login form and the search box

The next thing I noticed is that there is still a login form in the HTML. It is confusing for people because nothing indicates that there’s nothing to log into. I wanted to remove the form, but it was duplicated across 127 thousand files!

First I tested it on one file:

sed -i -e '/<div id="search-box">$/,+9d' viewtopic.php?…

And then ran it across all files:

find . -name '*.php*' -exec sed -i -e '/<div id="search-box">$/,+9d' {} \;

This took a fair bit of time, but was successful. I actually don’t know how long, because I went out for a small hike.
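
Since sed -i rewrites files in place, it can be worth previewing the result first. The Python function below is a small sketch that mimics the sed expression used above: it drops the line ending with the search box div plus the 9 lines that follow it, and leaves everything else alone. The function and parameter names are mine.

```python
def drop_block(lines, start_suffix='<div id="search-box">', extra=9):
    """Remove each line ending with start_suffix and the `extra` lines after it."""
    kept = []
    to_skip = 0
    for line in lines:
        if to_skip:  # still inside a block that is being deleted
            to_skip -= 1
            continue
        if line.rstrip("\n").endswith(start_suffix):
            to_skip = extra  # drop this line and the next `extra` lines
            continue
        kept.append(line)
    return kept


# Tiny made-up page: the box and its 9 following lines disappear.
page = ["<body>\n", '  <div id="search-box">\n']
page += [f"  form line {n}\n" for n in range(9)]
page += ["</body>\n"]
print("".join(drop_block(page)))
```

Comparing the output for a handful of files with the originals shows whether 9 is the right number of lines for your theme before touching the whole mirror.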

Let’s make it smaller

The reason why the forum occupies a large amount of disk space is that even a small file occupies a full block on disk, so there’s a sort of file count tax that you pay when storing many files. But there’s something you can do about it. I realized that the forum archive is static, so I can use a read-only file system, and there are read-only file systems which pack files efficiently. After a quick look around, SquashFS turned up as the best pick, with efficient file packing, compression, and support in the Linux kernel. The whole packed forum shrank from 5G to 517MB. I mounted it using the loopback device on the web server (added it to /etc/fstab), and voilà! Almost a 10× reduction in size. My web server only has 20G of disk space, so saving 4.5G is significant.
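
To see how much of the space goes to the per-file block overhead on a given mirror, the apparent and the allocated sizes can be compared. The Python sketch below assumes a block size of 4096 bytes for the arithmetic, which is a common default but not something I verified for my file system, and it relies on st_blocks, which is only available on Unix-like systems.

```python
import os


def allocated_size(size, block_size=4096):
    """Bytes taken by a file of `size` bytes when stored in whole blocks."""
    return (size + block_size - 1) // block_size * block_size


def tree_sizes(root):
    """Return (apparent_bytes, allocated_bytes) for all files under root."""
    apparent = allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            apparent += st.st_size
            allocated += st.st_blocks * 512  # st_blocks counts 512-byte units
    return apparent, allocated


# A 1-byte file still costs one full block.
print(allocated_size(1))
```

The difference between the two numbers returned by tree_sizes() is the part that file packing can win back; the rest of the saving comes from compression.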

Unresolved problems

At the time of writing there are a number of problems I haven’t addressed in my forum archive. If I manage to, I’ll update this page with new information.