Gentoo, VPS, Docker and an Ethereum Go node

Blockchain, unlike “cloud computing”, is more than a buzzword: it proves superior for integral and consistent systems of record in many respects, such as IT infrastructure footprint and cryptographic security of data at rest. While there are many projects out there aiming to deliver technology solutions based on blockchain concepts, I believe Ethereum will continue to play a crucial role as an underlying backbone for distributed applications and storage.

Since Ethereum is an open source project, I performed a little exercise of launching a public node. Perhaps I could even try some mining? To make things more difficult, I’ll describe here how I did that on Gentoo running on my VPS (a virtual private server hosted by Linode), inside a Docker container. This is in no way an attempt to get rich by mining, since a VPS only operates on a CPU (of which I have 2 cores) and a VGA-compatible stub driver described as:
00:01.0 VGA compatible controller: Device 1234:1111 (rev 02)
Quite obviously this cannot run any mining in a serious fashion. All in all, there are some observations gathered throughout the exercise and a few problems solved, which I hope could make someone’s life easier.

I will start with installing the Docker daemon. Surprisingly, there are no software package dependencies. That’s really good, because I want my server to remain minimal.
Packages installed: 389
Packages in system: 43

root@rzski data # equery d docker
* These packages depend on docker:
root@rzski data #
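For reference, installing and starting the daemon on a Gentoo/OpenRC box boils down to something like this (a sketch; the package lived under app-emulation at the time and has since moved to app-containers on newer trees):

emerge --ask app-emulation/docker
rc-update add docker default
/etc/init.d/docker start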

The Ethereum project team now provides an official Docker image. Once the daemon is installed, it is as easy as pulling the image from the official repo with docker pull ethereum/client-go. The image is only 44MB, which again keeps my server happy, as storage space ain’t cheap these days.

Before creating a new container from this image, here’s a brief comparison of the three sync modes that geth (the Ethereum node software written in Go) can run with:

    --syncmode "full": geth downloads both the block headers and the block data and validates everything from the genesis block.
    --syncmode "fast": geth downloads the block headers and data without re-executing every historical transaction, but once it catches up to the current block it switches to “full” sync and starts validating everything on the chosen network. This is the default option.
    --syncmode "light": geth only downloads block headers and fetches everything else on demand; mining is disabled (this can be verified by the list of loaded modules, as in the quick check below. Even if you try to load the miner module later through the console, geth will print an error stating that mining is not possible in this mode).
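A quick way to see the last point for yourself is to start a throwaway light node and look at the module list printed in the console banner (a sketch; the exact module list and wording vary between geth versions, but the miner module is absent in light mode):

docker run -it --rm ethereum/client-go --syncmode "light" console
...
 modules: admin:1.0 debug:1.0 eth:1.0 net:1.0 personal:1.0 rpc:1.0 web3:1.0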

I went with the default “fast” sync mode, but decided to specify resource limits to prevent my container from slowing down other services I’m running from my server. I did the following:
Create a volume so the blockchain data could remain persistent: docker volume create geth_vol
And then launch the image to create a new running container (note that -v needs a target path inside the container; the official client-go image keeps its chain data under /root/.ethereum):
docker run -it -m 2G --cpus 1.5 --storage-opt size=20G --name geth_container -v geth_vol:/root/.ethereum ethereum/client-go --syncmode "fast" console

If you’ve run a node previously, you’ll see how naive I’d been by limiting the container to 20 GB of storage space… Etherscan offers a graph showing how much space is actually needed to operate a public Ethereum node. At the time of writing it is almost 100 GB, which greatly exceeds what I have available on this VPS, so I will have to abandon the mining idea and switch to light mode. Perhaps in the future, once sharding is enabled, this will not be an issue.
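To keep an eye on how much space the chain data actually consumes, you can ask Docker where the volume lives and measure it directly (the path shown is the Docker default on Linux):

docker volume inspect -f '{{ .Mountpoint }}' geth_vol
du -sh /var/lib/docker/volumes/geth_vol/_data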

Other settings are quite self-explanatory: I gave the container half of my RAM and one and a half CPU cores, which shows up as the typical 150% CPU user time in “top”. The CPU limit can also be specified in microseconds of CPU time for finer granularity and updated on demand with the docker container update command or via Kubernetes. When it comes to limiting resources, Docker depends on cgroups. In my case, not all cgroup options were compiled into the kernel, so upon launching geth in interactive mode I got the following warning: WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted.
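For example, the limits of a running container can be adjusted like this (a sketch; --cpu-period/--cpu-quota are the microsecond-level knobs, and a quota of 150000 against a period of 100000 corresponds to the same 1.5 cores; depending on the Docker version, --cpus is accepted here as well):

docker container update --memory 2G --cpus 1.5 geth_container
docker container update --cpu-period 100000 --cpu-quota 150000 geth_container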

That warning might seem benign but is actually quite tricky. The Linux kernel’s OOM killer sends the KILL signal to processes when memory runs out, and the OOM state is determined by a mixture of physical and swap memory. So once my container had consumed all the swap, it still got killed, despite having allocated only half of the physical memory (I had 2 GB left for other services). The only trace of this killing was a cryptic message in docker ps -a informing me that my container had stopped with exit code 137 (128 + 9, i.e. SIGKILL).
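You can also ask Docker directly whether it attributes the kill to the memory cgroup (a host-level OOM kill may still show up only as exit code 137):

docker inspect -f 'OOMKilled={{ .State.OOMKilled }} ExitCode={{ .State.ExitCode }}' geth_container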

There are two ways to address this. The first is to enable the swap cgroup options in the kernel. The downside of this approach is a performance hit, plus recompiling the custom kernel is required. In case someone wants to give it a go, the steps are: get the kernel sources, make oldconfig from the current settings in config.gz, make menuconfig to enable the missing swap cgroup option, make -j2 to compile the sources, then install the GRUB bootloader, specify the path to the kernel binary and select the boot loader in the VPS dashboard (otherwise it will try to load the default kernel). That should do the trick.
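Roughly, that rebuild looks like this (a sketch; the relevant option is memory cgroup swap accounting, CONFIG_MEMCG_SWAP, whose exact name and default vary between kernel versions, and the bootloader step assumes GRUB 2 on a Linode-style host):

cd /usr/src/linux
zcat /proc/config.gz > .config        # start from the running kernel's config
make oldconfig
make menuconfig                       # enable memory cgroup swap accounting
make -j2 && make modules_install && make install
grub-mkconfig -o /boot/grub/grub.cfg  # then select the new kernel / GRUB 2 in the host dashboard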

A slightly faster option is to simply create more swap space and reduce the swappiness. The downside here is naturally the storage space, plus not everyone has the flexibility to move partitions around as with LVM. You can, however, create a swap file wherever you do have some space, at a small performance hit (writes to pages backed by this file go through the filesystem layer). Here’s how to achieve this:

    First, drop existing caches: echo 3 > /proc/sys/vm/drop_caches
    Second, create a swap file:
    dd if=/dev/zero of=/path/to/swap bs=1024 count=1500000 (for 1.5GB of swap)
    chmod 0600 /path/to/swap
    mkswap /path/to/swap && swapon /path/to/swap
    Finally, reduce swappiness (how aggressively the kernel swaps pages out of physical memory; the default is 60): echo 10 > /proc/sys/vm/swappiness Note: use sysctl to make this change persistent if you need to, as shown below.
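To make both the swappiness and the swap file survive a reboot, something along these lines works (the fstab entry reuses the path from above):

echo "vm.swappiness = 10" >> /etc/sysctl.conf
sysctl -p
echo "/path/to/swap none swap sw 0 0" >> /etc/fstab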

With that, the node and the container in which it runs should remain safe from the OOM killer. There is another option here: Docker’s flag to exempt a container from being OOM-killed (--oom-kill-disable), but that’s silly on a production server. The OOM algorithm in a recent kernel will kill the “worst” process; if it can’t, it will kill something else, which could lead to a disaster. In any case, after following the steps above, it is safe to restart the container with docker start geth_container.

Note: it is super comfortable to use zsh with docker, as it auto-completes docker commands and lists help options as well as locally created container and volume names. While it is good practice to custom-name things, you don’t have to.

And that’s all: no configuration is required to start participating in the Ethereum network. Peers will be auto-detected within a minute and synchronisation will happen automatically. You might want to reduce verbosity in the console (debug.verbosity(2)), check which peers you’re connecting to with admin.peers and, obviously, the status of your synchronisation with eth.syncing.

Mining is only possible after syncing completely, but if you have the disk space for that, then all you need is to create an account:
personal.newAccount("password")
and geth will automatically use this account’s address to credit whatever it manages to mine.
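In the console, that boils down to something like this (a sketch; setting the etherbase explicitly is optional since the first account becomes the default coinbase, and the argument to miner.start is the number of threads):

> personal.newAccount("password")
> miner.setEtherbase(eth.accounts[0])
> miner.start(1)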

Next-gen infrastructure part 2

Around the end of 2016 I wrote a longer article about the state of IT infrastructure, trying to single out a trend I was observing. I was clearly inspired by Stanislaw Lem’s books as well as my own deep-dive sessions into technology. My conclusion back then was a vision of a container or unikernel approach written directly into field-programmable gate arrays, combining standard operations in hardware with a micro-service architecture, but there was a substantial challenge to be overcome first.

Recently, my friend Matt Dziubinski shared with me an excellent article by Kevin Morris, published in the Electronic Engineering Journal, that seems to show where things are taking off. Titled “Accelerating Mainstream Services with FPGAs”, it brings up how Intel acquired Altera, a major player in the FPGA market. It’s been a while since then, with many comments suggesting Intel is taking the hardware infrastructure to the next level. Since then, however, the world hasn’t heard about a brain child of the top chip manufacturer fused with hardware accelerators from Altera. How is this merger really driving acceleration in the data center?

The major risk I saw back when I wrote my article was the enterprises’ capability to adopt such technology, given talent scarcity, especially for such low-level tinkering at mass scale. That type of activity is usually reserved for hyperscalers and high frequency trading technology. According to Intel’s statements and the EEJ article itself, Intel is moving forward in a slightly different direction by launching PACs (news statement), or programmable acceleration cards, which are based on Altera’s Arria 10 GX FPGAs and come with a PCIe interface. That’s right: the smart people at Intel have addressed the challenge by letting specialized companies tune acceleration cards on a per-need basis, which they can then simply insert into their boxes of preference: Dell, HP or Fujitsu. I am guessing integration with blade-type infrastructure is a matter of time as well. This way, enterprises no longer need to hire FPGA programmers with years of Verilog experience. In a consolidating market, that’s a major advantage.

And now, most importantly, a glimpse at the numbers. According to the EEJ article: in financial risk analysis, there’s an 850% per-symbol algorithm speedup and a greater than 2x simulation time speedup compared with a traditional “Spark” implementation. On database acceleration, Intel claims 20x+ faster real-time data analytics, 2x+ for traditional data warehousing, and 3x+ storage compression. And that speed-up is without considering the upcoming HBM2 memory and the 7nm chip manufacturing process (the FPGAs themselves are on 20nm).

Gmail with own domain

Gmail seems to be everyone’s favorite web frontend for email. Until recently, it also had an option allowing sending from custom domains, so the recipient would see “from: yourname@yourdomain.com”, for example, instead of the not-so-professional name@gmail.com. These days, however, Google is promoting its G Suite set of products, which makes this modification a bit harder if your domain is purchased from an external vendor. Here’s a brief article explaining how to set up your own domain as the default “from” domain in Gmail.

First of all, to avoid reinventing the wheel, I googled (heh) for existing approaches and found many cases where an external MTA performs authenticated submission to smtp.gmail.com. This is sort of weird, but apparently that’s how Google fights spam and email address spoofing. I tried that approach only to find out that, in addition to the mandatory authentication (MTA to MTA with passwords?), Google also modifies the header’s “from” field in incoming messages, stamping in the Gmail account and moving the previous address to a new header line called “X-Google-Original-From”. As you can imagine, this makes things difficult to manage. In addition, Gmail would re-deliver these messages back to where the MX records point, even though the desired configuration was in place, so I had to create a black-hole rule to prevent SMTP flooding (the discard directive).

For that reason I tried a different approach. Here’s a brief explanation of how to set this up using Exim as the MTA (but any other SMTP server would do). In this example, the MX records should point to the external server running the MTA (don’t forget the dot at the end). For outbound mail, Gmail will act as a client (MUA), using Exim as the MTA to authenticate over TLS and send mail out. The Gmail configuration doesn’t change and is explained here. For this to work, authentication data need to be created on the MTA, and one more thing is needed: header rewriting at SMTP time, if the domain you’re configuring isn’t the same as the primary FQDN of the MTA (or if you allow clients to send with multiple domains from the same server/container). Therefore, to have mail go out with the right FQDN, a rewrite rule like this is required:


begin rewrite
\N^my_name@my_fqdn.com$\N my_name@newdomain.com Sh
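The authentication part is a standard Exim authenticator. A minimal sketch could look like the following (the authenticator name, the hard-coded credentials and the decision to advertise AUTH only over TLS are illustrative assumptions, not taken from my actual config):

begin authenticators

plain_tls:
  driver = plaintext
  public_name = PLAIN
  server_advertise_condition = ${if def:tls_in_cipher}
  server_condition = ${if and{{eq{$auth2}{my_name}}{eq{$auth3}{my_password}}}}
  server_set_id = $auth2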

As for incoming mail, the task is fairly easy. Instead of authenticating to Gmail, I redirect/forward the messages to the original Gmail account after accepting them as local. This can be achieved by creating an exception to the default redirect router (which normally reads /etc/aliases for redirection paths), by adding a condition to match the new domain in question. Here’s an example:


begin routers

my_new_redirect:
driver = redirect
domains = newdomain.com
data = ${lookup{$local_part}lsearch{/etc/aliases}}
file_transport = address_file
pipe_transport = address_pipe

Any file could be used instead of /etc/aliases; just make sure the UID/GID under which your MTA runs can read it. Following this example, the format would be: “my_name: gmail_name@gmail.com”. And that’s all: it’s SPF-friendly and IMHO cleaner and simpler than the authenticated approach. You might get cursed at by SMTP purists for rewriting headers, but well, Google does it too.

Artificial Intelligence and Creativity

First, a brief off-topic introduction. A quick search returns the following definition of creativity: “the use of imagination or original ideas to create something; inventiveness”. Not surprisingly, the definition refers to two other words which are similarly awkward if you’re trying to think like a machine… Still, we know the process exists and everyone can name creative people they know directly, so the subject definitely exists. Have we, however, reached a point where creativity has been distilled down enough to allow coding it into a machine?

The answer to that might depend on one’s understanding of “thinking like a machine”. I wrote that intentionally, because it seems the terms are generally misused. I’m used to getting a few sales pitches every month about how some new AI or even machine learning solution will revolutionize my business. I usually ask how it works, and the answer puts the solution in one of these pots:
a) simple logic
b) actual AI
c) machine learning
As you can guess, most answers end up being a), sometimes a mixture of a) and b). I have yet to experience someone selling me true c).

What’s the difference? Software has allowed conditional responses for a very long time. Conditional clauses can be smartly designed and nested to a point where it seems the software (usually the frontend…) is “very smart”, or can even “predict what the user needs”. That’s not AI, though.

AI kicks in where an actual algorithm, based on clues, can make assumptions or assign scores even though no direct reaction has been designed into the software. We have AI everywhere already: every spam filter and every antivirus engine follows this rule. For anything more complex, though, like aiding decision making, the required computing capacity was so high that AI for wider use is only now becoming popular.

And finally there’s machine learning, which takes AI to a new tier: not only does the algorithm go over data to build rules, it can also, based on previous runs, build further rules, and rules for those rules, and so on. Computationally, that’s usually very expensive, which is why there’s a new blooming market for chips and FPGA solutions that optimize this specific type of load to achieve good response times in the learning and decision-making process.

I hope this serves as a good introduction to what I actually wanted to share. There’s an excellent paper on arXiv detailing how tasks submitted to AI/ML solutions were “resolved”, sometimes in the most surprising way. The paper covers 27 anecdotes about how AI found the most surprising answers (to the surprise of the researchers).

From this perspective, wouldn’t you call it pure creativity? And if yes, I’d like to propose a new definition of creativity: “in a huge lake of data (facts and clues), find new connections that are valuable”. I guess it works for intelligence as well, if you cross out the word “new”.

Talent Management

I had the privilege to write another piece for the IT WIZ magazine, this time on talent management. A wide subject and, as always, a struggle to write something new, something that’d be interesting for everyone. So the idea came up to write it “with a twist”. The Polish version can be found here.

Neutrality

It’s been a while, so here’s something more social and less technical again. And I get to join the lodge of those writing about Linus Torvalds! Let me first write this: any dispute about style or approach to problem solving must be second priority if your main product, Linux, remains one of the most stable and secure pieces of software ever produced. This is not just any software. We’re talking about millions of lines of code changed between every point release, provided by tens of thousands of developers worldwide; software working with a broad range of hardware architectures, “talking to” a countless number of peripheral devices. Finally, a piece of software causing discussions like “If Linus is gone, what will happen to the kernel?”.

I believe it was worthwhile summarizing in these few sentences what the true achievement there is. I had to start with that brief summary; otherwise I felt I’d be joining the group that might have been well described by Linus in the following words some time in 2014:

“And there’s a classic term for it in the BSD camps: “bikeshed painting”, which is very much about how random people can feel like they have the ability to discuss superficial issues, because everybody feels that they can give an opinion on the color choice. So issues that are superficial get a lot more noise. Then when it comes to actual hard and deep technical decisions, people (sometimes) realise that they just don’t know enough, and they won’t give that the same kind of mouth-time.”

You can say a lot about Linus and his approach to people or to dispute. Dispute will always happen when you work with people of strong character who have devoted parts of their lives to mastering an area they then have to prove in production. But not everyone you get to work with will follow this principle. And yet, understanding the root cause of some debates and efficiently not wasting any energy on them is a decent trait of a leader, one that doesn’t often appear in the popular coaching memes. Probably because it is also painful and not entirely neutral.

But can you make big achievements and remain neutral?

Improving the VoIP foreign number solution

It has been a while and I’ve noticed a few issues in my solution (more info here) for “assigning” a foreign number to your smartphone. Since solving them was a nice little achievement, I’ll share the description here. It was quite an investigation, too!

The first problem was that the connection kept timing out from time to time. Not always, and I couldn’t determine a pattern (ah, I love these situations!). In such cases there’s no other choice but to get your hands dirty, and in this case that means some deep packet inspection. Asterisk makes this task fairly easy, so tcpdump was not required: using sip set debug ip/peer on/off in the CLI shows how my Asterisk and the SIP client on my mobile are talking. That’s possible even without a deep understanding of the Session Initiation Protocol; simply googling its workflow is enough to see how it should look. Then all you have to do is compare the expected flow with what you actually get.
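For reference, the relevant CLI incantations look roughly like this (the peer name is a placeholder; the syntax is from the Asterisk 11 SIP channel):

asterisk -rvvv
*CLI> sip set debug peer mymobile
(wait for a REGISTER or place a test call, then watch the SIP dialogue)
*CLI> sip set debug off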

Because of my intermittent issue, I had two sets of SIP debug data: when it works, and when it doesn’t. Comparing the two showed me that when the timeout happens, it is actually the phone that stops responding to SIP INVITEs. Having tried a few other SIP clients on the phone, I had a strong feeling this was not a client issue. So maybe the network? Bingo! As soon as I switched off my wifi and landed on 4G, the connection worked flawlessly and never failed. As this was weird, I went on to research why that might happen and found some content explaining how most home routers have a faulty implementation of an application layer gateway for SIP. Indeed, it was SIP ALG messing with my SIP traffic in failed attempts to “secure my traffic” by inspecting source/destination addresses in the SIP packets. In my case, it would filter out my SIP traffic as soon as the router “forgot” the addresses and mappings, which happened about 30 seconds after registration (the TCP handshake from the mobile to my VPS).

Considering that the minimum frequency at which SIP clients can re-register is once every 60 seconds, and that my router would time out after about 30 seconds, I was left with only half of each registration period working. That’s way below my expectations. I can’t reconfigure my home router (thanks to the ISP), so what do? There’s no way of setting the registration time lower in the client itself without dirty hacks. I ended up doing the following: switch from SIP to IAX2 (Asterisk’s preferred protocol for VoIP), and then write a small patch to my ebuild for Asterisk on Gentoo to define the re-register frequency as 25 seconds, using the following:


$ grep sed /usr/portage/net-misc/asterisk/asterisk-11.25.1.ebuild
sed -i 's:EXPIRE.*60:EXPIRE 25:' "${S}"/channels/iax2.h && ewarn patched IAX2 registry timeout
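If you want to reproduce this, the rough workflow on Gentoo is to edit the ebuild, regenerate its digest and re-emerge the package (a sketch; the path assumes the classic rsync tree layout, and a local overlay is the cleaner long-term home for such a change):

cd /usr/portage/net-misc/asterisk
(add the sed line to the ebuild, e.g. in src_prepare, then:)
ebuild asterisk-11.25.1.ebuild manifest
emerge --oneshot net-misc/asterisk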

After re-compiling (making such tweaks is very easy thanks to Gentoo) and restarting Asterisk, even though my client kept asking for 60-second sessions, Asterisk would “demand” a new session (from which all it gets is an update of the smartphone’s actual IP address) every 25 seconds anyway. And the timeout problem is almost gone!

Almost, because Asterisk has a security setting called “nat”. It has to be set to “no” regardless of whether it is SIP or IAX2, which means that when the agents register, Asterisk doesn’t inspect the IP addresses in the headers; those checks would fail anyway, since NAT translates the IP address of my phone from the local LAN to the public IP address of my router. Since I control access to my Asterisk using multiple factors and layers (iptables and user/pass), I consider it safe to disable.

Improving the efficiency of technical teams

Happy New Year to all the regular readers!

Recently, I had a very interesting chat with a fellow IT leader; we discussed the efficiency of tier 1 technical support and how we usually try to achieve the optimal 80% efficiency mark (80% of the cases to be solved within self-service or the first help desk line). While common techniques are indeed… common (by seeking common denominators in root causes), when it comes to tier 2 and above, there were fewer obvious solutions. And that’s natural, given that complexity becomes a major factor and is implied by the higher tier. So what can leaders do to gain improvement in this area?

I did share one recommendation which works well for me. It caught some attention, therefore I’ll cover it briefly here. The method is called obtaining a minimal working example. It is mostly popular among software developers, who are more likely to stumble across a blocker while working on a complex software project. What do? Involve more brains then, right? Well, of course, but who has the time to go through a litany of complex lines of code while everyone has their own work to do? Especially if you consider sharing the problem with an open source community, you will need to come prepared. You’ll get no input if the question is presented in a way that might resemble “hey, do my work for me, would you”.

A minimal working example is a complete (ergo: working) piece of code which has to be well crafted before it can be presented via collaboration tools: it narrows down the scope and yet still demonstrates the issue. That is required to allow a group to investigate quickly and efficiently. Building an MWE is a method, but even more, it is an exercise for the author. The developer has to collect only the required code, leaving the remainder out to avoid obscuring the root cause of the issue and slowing down the investigation. Building the sample requires considerable effort to be put into evaluating every step and… yes, about 80% of these cases get resolved during the preparation of the MWE!

While it is certainly a trick software developers and their managers should consider, it is not limited to programming. System administrators can also benefit from this approach, especially when working on complex configuration files, sometimes including vast and complex scripting. As tier 2 or 3 support teams gain the ability to build a minimalistic setup with the problem captured in it, they will surely also become much more competent and empowered to solve and prevent other technical issues. Naturally, they will also be very efficient when working with the vendor (if that is eventually required). Building an MWE also encourages teamwork through empathy: after all, the MWE is prepared for someone else to understand and gain insight from.

So, that’s one method; please share what works best for you!

Enter the commoditized IT, pushing the boundaries

Commoditization of services might sound like a new trend. After all, all those Ubers and Airbnbs seem like fresh ideas that appeared just recently. We still call them “market disruptors”, right? But probably only those working with IT every day realize that the same trend, aiming for the smallest item performing a unique function, has been present in the IT industry for decades. In this article, I will present the trend as I see it and share my observations. For the casual reader: I promise to keep it as user friendly as possible!

With a certain level of simplification, I believe I can state that commoditization in IT started when the “distributed environment”, as the IBM people call it, took off with the approach of creating simple, stand-alone computing platforms we now refer to as servers. The attempt to offer services installed on multiple servers, dispersed globally for optimal customer experience and resilience, is still popular even with the obvious faults of this approach. For example, servers had a fixed set of resources at their disposal, as the vendor designed them to be applicable to the majority of use cases, which usually didn’t match what the hosted application really needed. These servers were not as resilient as the mainframe, but at a reasonable price they allowed a much faster concept-to-software-service process. The inefficiencies naturally led to costs that normally wouldn’t have to be incurred, but also to indirect outcomes, such as developers, especially in the high performance computing sector, having to perform low-level optimizations of their software.

The second level of infrastructure commoditization is what I call the concept of defining the key resource areas commonly used by software and creating platforms that let administrators look after resource pools. Typically, this is referred to as server virtualization. Imagine your developers work on two projects where the product in each case is an application, but with entirely different behavior. One application doesn’t store much data but performs difficult computations: no problem, the admin can spawn (yes, spawn!) a new server with some memory but multiple processors (as long as the app can make use of them). The other application does little math but needs to load huge objects into memory? The admin can now take the memory resources saved on the first project and allocate them here. Running out of resources in the pools? Just insert (hot-swap) another blade of CPUs or RAM, or expand the SAN for data storage.

Around the same time, the CPU itself underwent even further commoditization; we could call it level 3. Multiple solutions have been implemented in processors to improve their efficiency without increasing the clock speed, or when further miniaturization of the manufacturing process was not yet achievable. One of the key optimization areas was based on the observation that a lot of potential is lost during the processor cycle itself. As a result, various techniques and instruction set extensions have been implemented, such as SMT (simultaneous multithreading, marketed by Intel as Hyper-Threading), which exposes each physical core as multiple logical cores, and the VT extensions for hardware-assisted virtualization. The result is logical cores within physical cores forming a modern CPU.

Level 4 is a more recent tale, when the concept of SDNs, or “software defined networks”, appeared. In simple words, the virtualized stack of standard commodities could be expanded to cover the network as well, which in the internet era is a commodity in itself. The idea is to present a complete set of infrastructure items to meet the development design, which naturally speeds up the whole infrastructure design phase of a project and, most importantly, offers a great deal of simplicity to everyone involved in the development. Deployment of new applications or services has never been this easy.

With software defined networking and pools of virtual resources at hand, it is not hard to notice that this approach still borrows from the good old “application hosted on a server” concept of what I called the “level 1” era. This means each virtual server comes with a lot of overhead, including the operating system and its “wealth” of features. An OS is usually prepared with the idea of just working for as many setups as possible, which by definition makes it “not the most optimal piece”. Level 5 of commoditization comes to the rescue: containers, among them the most popular container technology, Docker. A single server can run multiple containers, which from the outside look like virtual machines themselves, yet they contain only the bare minimum of the OS that the applications hosted in them need to operate. Brilliant, isn’t it? Add to that software to orchestrate multiple containers, such as Kubernetes.

So what’s next? Commoditized infrastructure pieces and a commoditized operating system allow a granular choice of elements to operate applications or services. It doesn’t look like much more can be achieved here. Who can push the boundaries further? The answer is based on the fact that, just like the OS, almost all concepts in popular technology are designed so that 95% of the use cases succeed. But what happens when you’re in the remaining 5%? And who usually ends up there? For the last decade, in my opinion, two areas have fallen into that 5%: high frequency trading and high performance gaming. Add to them the extreme data center owners (aka hyperscalers), such as Google or Facebook, and you’ll notice they all already have the solution. Software will only ever be software: it gets loaded into standard memory and executed on a standard core. But if the same operations applied to a virtual container happen a million times per day on a standard piece of commodity hardware, why not move them to hardware?

FPGAs, or field programmable gate arrays, are becoming popular because they allow just that. The software developer’s work doesn’t have to end as soon as the code is sent to the compiler. The compiler allows the code to be loaded and executed by a typical CPU, which by design is universal and can do a lot of things. Most software, however, performs a set of standard operations and then something unique to its nature, which gets repeated millions of times. FPGAs allow commoditization of that unique, repeatable activity, which, when compiled onto the programmable board, can give it a performance boost of 10-100x at only a 20x increase in power draw. That is an immense cost reduction compared with simply scaling the pool of CPU/MEM resources.

Does this mean application programming has to become very low-level again, making a huge turn away from the era of rapid development? Not at all: there are tools available that make it very easy, for example by utilizing Verilog. Certain standard blocks (DSPs) come pre-installed on the boards, so that developers only have to implement the higher-level logic. At the same time, programmable boards are available at reasonable prices and can be re-programmed, unlike the super-expensive boards produced on demand for HFT trading companies.

FPGAs are the next big thing. If the above combination of breakthroughs did not convince you, take into consideration the battle to purchase Lattice, a vendor of FPGAs only, for $1.3 billion. Consider the rise of Xilinx, specializing in programmable SoCs. Last but not least, Intel acquired Altera about a year ago with the goal of joining precisely this promising business.

So what can the CIO/CTO do? While it might not be very easy to find Verilog/VHDL specialists who are also capable developers understanding time-to-market and quality programming concepts, it was only a matter of time until other vendors tried to fill the gap. Amazon is already offering EC2 instances with programmable hardware. OVH offers a similar solution called RunAbove, which at the time of writing this article had completely sold out. Last but not least, there’s also Reconfigure.io, which offers a Go language compiler that takes the code developers wrote and targets programmable boards directly.

What’s next? Maybe containers hosting micro services could be moved to FPGAs?