Gmail with own domain

Gmail seems to be everyone’s favorite web frontend for email. Until recently, it also offered the option of sending from custom domains, so the recipient would see “from: yourname@yourdomain.com”, for example, instead of the not-so-professional name@gmail.com. These days, however, Google is promoting its G Suite line of products, which makes this modification a bit harder if your domain was purchased from an external vendor. Here’s a brief article explaining how to set up your own domain as the default “from” domain in Gmail.

First of all, to avoid reinventing the wheel, I googled (heh) for existing approaches and found many cases where an external MTA performs authenticated submission to smtp.gmail.com. This is sort of weird, but apparently that’s how Google fights spam and email address spoofing. I tried that approach only to find out that in addition to the mandatory authentication (MTA to MTA with passwords?), Google also modifies the “From” header of incoming messages, stamping in the Gmail address and moving the previous one to a new header line called “X-Google-Original-From”. As you can imagine, this makes things difficult to manage. On top of that, Gmail would re-deliver these messages back to wherever the MX records point, even with the desired configuration in place, so I had to create a black-hole rule (a discard directive) to prevent SMTP flooding.

For that reason I tried a different approach. Here’s a brief explanation of how to set this up using Exim as the MTA (though any other SMTP server would do). In this example, the MX records should point to the external server running the MTA (don’t forget the dot at the end). For outbound mail, Gmail acts as a client (MUA), authenticating to Exim over TLS and using it as the MTA to send mail out. The Gmail configuration doesn’t change and is explained here. For this to work, authentication data need to be created on the MTA, plus one more thing: header rewriting at SMTP time, if the domain you’re configuring isn’t the same as the primary FQDN of the MTA (or if you allow clients to send with multiple domains from the same server/container). To have mail go out with the right FQDN, a rewrite rule like this is required:


begin rewrite
\N^my_name@my_fqdn.com$\N my_name@newdomain.com Sh

As for incoming mail, the task is fairly easy. Instead of authenticating to Gmail, I redirect/forward the messages to the original Gmail account after accepting them as local. This can be achieved by creating an exception to the default redirect router (which normally reads /etc/aliases for redirection paths), adding a condition to match the new domain in question. Here’s an example:


begin routers

my_new_redirect:
  # handle only the new domain; anything else falls through
  # to the default routers
  driver = redirect
  domains = newdomain.com
  data = ${lookup{$local_part}lsearch{/etc/aliases}}
  file_transport = address_file
  pipe_transport = address_pipe

Any file could be used instead of /etc/aliases; just make sure the UID/GID under which your MTA runs can read it. Following this example, the format would be: “my_name: gmail_name@gmail.com”. And that’s all: it’s SPF-friendly and IMHO cleaner and simpler than the authenticated approach. You might get cursed by SMTP purists for rewriting headers, but well, Google does it too.
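
For completeness, the outbound authentication piece could look something like this minimal sketch (the authenticator name and the password file are my placeholders, not part of the original setup, and PLAIN should only be advertised over TLS):

begin authenticators

plain_login:
  # PLAIN over TLS; $auth2 is the username, $auth3 the password.
  # /etc/exim/passwd holds lines of the form "user: password".
  driver = plaintext
  public_name = PLAIN
  server_prompts = :
  server_condition = ${if eq{$auth3}{${lookup{$auth2}lsearch{/etc/exim/passwd}}}}
  server_set_id = $auth2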

Artificial Intelligence and Creativity

First, a brief off-topic introduction. A quick search returns the following definition of creativity: “the use of imagination or original ideas to create something; inventiveness”. Not surprisingly, the definition refers to two other words which are similarly awkward if you’re thinking like a machine… Still, we know the process exists, and everyone can name creative people they know directly, so the subject definitely exists. Have we, however, reached a point where creativity has been distilled down enough to allow coding it into a machine?

The answer to that might depend on one’s understanding of “thinking like a machine”. I wrote that intentionally, because it seems the terms are generally misused. I am used to getting a few sales pitches every month from vendors explaining how their new AI or even machine learning solution will revolutionize my business. I usually ask how it works, and the answer puts the solution into one of these pots:
a) simple logic
b) actual AI / machine learning
c) deep learning
As you can guess, most answers end up being a), sometimes a mixture of a) and b).

What’s the difference? Software has allowed conditional responses for a very long time. Conditional clauses can be smartly designed and nested to the point where it seems the software (usually the frontend…) is “very smart”, or even “can predict what the user needs”. That’s not AI, though.

AI kicks in where an actual algorithm, based on a model, can make assumptions or score outcomes even though no direct reaction has been designed into the software. We have AI everywhere already: every spam filter and every antivirus engine uses this approach. For anything more complex, though, like aiding in decision making, the required computing capacity was so high that AI for wider use is only becoming popular now.
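
To illustrate the difference between pot a) and pot b), here’s a toy sketch of my own (not from any vendor pitch): the first function is a hard-coded conditional, the second scores a message with weights a model would have learned from data:

# Pot a): a hand-written rule pretending to be "smart".
def is_spam_rule(message: str) -> bool:
    return "free money" in message.lower()

# Pot b): a toy scoring model; in real life the weights come from
# training on data, not from a programmer anticipating every case.
WEIGHTS = {"free": 1.2, "money": 0.8, "meeting": -1.5}  # hypothetical learned weights

def spam_score(message: str) -> float:
    return sum(WEIGHTS.get(word, 0.0) for word in message.lower().split())

print(is_spam_rule("Free money inside!"))  # True
print(spam_score("free money inside"))     # 2.0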

And finally there’s deep learning, which brings AI into a new tier: not only does the algorithm go over data to build rules, it can, based on previous runs, build further rules, and rules for those rules, and so on. Computationally, that’s usually very expensive, which is why there’s a new blooming market for chips and FPGA solutions that optimize this specific type of load to achieve good response times in the learning and decision-making process.

I hope this serves as a good introduction to what I actually wanted to share. There’s an excellent paper on arXiv detailing how tasks submitted to AI/ML solutions were “resolved”, sometimes in the most unexpected way. The paper covers 27 anecdotes about how AI found the most surprising answers (to the surprise of the researchers).

From this perspective, wouldn’t you call it pure creativity? And if so, I’d like to propose a new definition of creativity: “in the huge lake of data (facts and clues), find new connections that are valuable”. I guess it works for the term “intelligence” as well, if you cross out the word “new”.

Talent Management

I had the privilege to write another piece for the IT WIZ magazine, this time on Talent Management. A wide subject and, as always, a struggle to write something new, something that’d be interesting for everyone. So the idea came up to write it “with a twist”. The Polish version can be found here.

Neutrality

It’s been a while, so here’s something more social and less technical again. And I get to join the lodge writing about Linus Torvalds! Let me first write this: any dispute on style or approach to problem solving must be second priority if your main product, Linux, remains one of the most stable and secure pieces of software ever produced. This is not just any software. We’re talking about millions of lines of code changed between every point release, provided by tens of thousands of developers worldwide. Software working with a broad number of hardware architectures, “talking to” countless peripheral devices. Finally, a piece of software causing discussions like “If Linus is gone, what will happen to the kernel?”.

I believe it was worthwhile summarizing in these few sentences what the true achievement there is. I had to start with that brief summary, because otherwise I felt I’d be joining the group that might have been well described by Linus in the following words some time in 2014:

“And there’s a classic term for it in the BSD camps: “bikeshed painting”, which is very much about how random people can feel like they have the ability to discuss superficial issues, because everybody feels that they can give an opinion on the color choice. So issues that are superficial get a lot more noise. Then when it comes to actual hard and deep technical decisions, people (sometimes) realise that they just don’t know enough, and they won’t give that the same kind of mouth-time.”

You can say a lot about Linus and his approach to people or dispute. Dispute will always happen if you work with people of strong character, who have devoted parts of their lives to mastering an area they then have to prove in production. But not everyone you get to work with will follow this principle. And yet, understanding the root cause of some debates and efficiently not wasting any energy on them is a decent trait of a leader, one that doesn’t often appear in the popular coaching memes. Probably because it is also painful and not entirely neutral.

But can you achieve big things and remain neutral?

Improving the VoIP foreign number solution

It has been a while, and I’ve noticed a few issues in my solution (more info here) for “assigning” a foreign number to your smartphone. Since solving them was a nice little achievement, I’ll share the description here. It was quite an investigation, too!

The first problem was that the connection kept timing out from time to time. Not always, and I couldn’t determine a pattern (ah, I love these situations!). In such cases there’s no other choice but to get your hands dirty… and in this case that means some deep packet inspection. Asterisk makes this task fairly easy, so tcpdump was not required. The sip set debug family of CLI commands allows finding out how my Asterisk and the SIP client on my mobile are talking. That’s possible even without a good understanding of the Session Initiation Protocol: simply googling its workflow is enough to see how it should look. Then all you have to do is compare the expected flow with what you actually get.
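
For reference, the relevant chan_sip CLI commands look like this (the address and peer name are placeholders): the first enables logging of all SIP traffic, the next two filter it down to a single address or configured peer, and the last stops the noise.

*CLI> sip set debug on
*CLI> sip set debug ip 192.0.2.10
*CLI> sip set debug peer myphone
*CLI> sip set debug off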

Because of my intermittent issue, I had two sets of SIP debug data: when it works, and when it doesn’t. Comparing the two showed me that when the timeout happens, it is actually the phone that stops responding to SIP INVITEs. Having tried a few other SIP clients on the phone, I had a strong feeling this was not a client issue. So maybe the network? Bingo! As soon as I switched off my wifi and landed on 4G, the connection worked flawlessly and never failed. As this was weird, I went on to research why that might happen and found some content explaining how most home routers have a faulty implementation of the application-layer gateway for SIP. Indeed, it was SIP ALG messing with my SIP traffic in failed attempts to “secure my traffic” by inspecting source/destination addresses in the SIP packets. In my case, it would filter out my SIP traffic as soon as the router “forgot” the address mappings, which happened 30 seconds after registration (the TCP handshake from the mobile to my VPS).

Considering that the minimum frequency at which SIP clients can re-register is once every 60 seconds, and that my router would time out after about 30s, I was left with the connection working for only half of each registration period. That’s way below my expectations. I can’t reconfigure my home router (thanks to the ISP), so what do? There’s no way of setting the registration time lower in the client itself without dirty hacks. I ended up doing the following: switching from SIP to IAX2 (Asterisk’s preferred protocol for VoIP), and then writing a small patch to my Asterisk ebuild on Gentoo to set the re-registration frequency to 25s, using the following:


$ grep sed /usr/portage/net-misc/asterisk/asterisk-11.25.1.ebuild
sed -i 's:EXPIRE.*60:EXPIRE 25:' "${S}"/channels/iax2.h && ewarn patched IAX2 registry timeout

After re-compiling (making such tweaks is very easy thanks to Gentoo) and restarting Asterisk, even though my client kept asking for 60-second sessions, Asterisk would “demand” a new registration (from which all it really gets is an update of the smartphone’s current IP address) every 25 seconds anyway. And the timeout problem is almost gone!

Almost, because Asterisk has a security setting called “nat”. It has to be set to “no” regardless of whether it is SIP or IAX2, which means that when the agents register, Asterisk doesn’t inspect the IP addresses in the header; those checks would fail, since NAT translates the IP address of my phone from the local LAN to the public IP address of my router. Since I control access to my Asterisk using multiple factors and layers (iptables and user/pass), I consider it safe to disable.
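
Since iptables is one of those layers, here’s a minimal sketch of what that rule set could look like (the address range is a documentation placeholder; substitute the networks your phone actually roams in):

# Accept IAX2 (UDP/4569) only from a known range; drop everything else.
iptables -A INPUT -p udp --dport 4569 -s 203.0.113.0/24 -j ACCEPT
iptables -A INPUT -p udp --dport 4569 -j DROP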

Improving the efficiency of technical teams

Happy New Year to all the regular readers!

Recently, I had a very interesting chat with a fellow IT leader; we discussed the efficiency of tier 1 technical support and how we usually try to achieve the optimal 80% efficiency mark (80% of cases solved within self-service or the first help desk line). While the common techniques are indeed… common (seeking common denominators in root causes), when it comes to tier 2 and above, there are fewer obvious solutions. And that’s natural, given that complexity becomes a major factor and is implied by the higher tier. So what can leaders do to drive improvement in this area?

I did share one recommendation which works well for me. It caught some attention, therefore I’ll cover it briefly here. The method is called obtaining a minimal working example (MWE). It is mostly popular among software developers, who are more likely to stumble across a blocker while working on a complex software project. What do? Involve more brains then, right? Well, of course, but who has the time to go through a litany of complex lines of code while everyone has their own work to do? Especially if you consider sharing the problem with an open source community, you will need to come prepared. You’ll get no input if the question is presented in a way that might resemble “hey, do my work for me, would you”.

A minimal working example is a complete (ergo: working) piece of code which has to be well crafted before it can be shared via collaboration tools: narrowed down in scope, yet still able to demonstrate the issue. That is required to allow a group to investigate quickly and efficiently. Building an MWE is a method, but even more, it is an exercise for the author. The developer has to collect only the required code, leaving the remainder out to avoid obscuring the root cause of the issue and slowing down the investigation. The process demands considerable effort evaluating every step of the sample and… yes, about 80% of these problems get resolved during preparation of the MWE!
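
As a toy illustration (a hypothetical case of my own): imagine a large reporting application that crashes on export. Instead of attaching the whole codebase, the author boils the failure down to the one call that breaks:

# Minimal working example: reproduces the crash without the application.
import json
from datetime import datetime

# Fails with TypeError: Object of type datetime is not JSON serializable.
print(json.dumps({"created": datetime(2017, 1, 1)}))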

While it is certainly a trick software developers and their managers should consider, it is not limited to programming. System administrators can also benefit from this approach, especially when working on complex configuration files, sometimes including vast and complex scripting. As tier 2 or 3 support teams gain the ability to build a minimalistic setup with the problematic bit captured in it, they will surely also become much more competent and empowered to solve and prevent other technical issues. Naturally, they will also be very efficient when working with the vendor (if that is eventually required). Building an MWE also encourages teamwork through empathy: after all, the MWE is prepared for someone else to understand and gain insight from.

So, that’s one method; please share what works best for you!

Enter the commoditized IT, pushing the boundaries

Commoditization of services might sound like a new trend. After all, all those Ubers and Airbnbs seem like fresh ideas that appeared just recently. We still call them “market disruptors”, right? But probably only those working with IT every day realize that the same trend, aiming for the smallest item performing a unique function, has been present in the IT industry for decades. In this article, I will present the trend as I see it and share my observations. For the casual reader: I promise to keep it as user friendly as possible!

With a certain level of simplification, I believe I can state that commoditization in IT started when the “distributed environment”, as the IBM people call it, took off with the approach of creating simple, stand-alone computing platforms we now refer to as servers. The attempt to offer services installed on multiple servers, dispersed globally for optimal customer experience and resilience, is still popular even with the obvious faults of this approach. For example, servers had a fixed set of resources at their disposal, designed by the vendor to be applicable to the majority of use cases. That rarely matched what the hosted application really needed. These servers were not as resilient as the mainframe, but at a reasonable price they allowed a much faster concept-to-software-service process. The inefficiencies naturally led to costs that wouldn’t otherwise have to be incurred, but also to indirect outcomes such as developers, especially in the high performance computing sector, having to perform low-level optimizations of their software.

The second level of infrastructure commoditization is what I call defining the key resource areas commonly used by software and creating platforms that let administrators look after resource pools. Typically, this is referred to as server virtualization. Imagine your developers work on two projects where the product in each case is an application, but with entirely different behavior. One application doesn’t store much data but performs heavy computations? No problem, the admin can spawn (yes, spawn!) a new server with modest memory but multiple processors (as long as the app can make use of them). The other application does little math but needs to load huge objects into memory? The admin can now take the memory resources saved on the first project and allocate them here. Running out of resources in the pools? Just insert (hot-swap) another blade of CPUs or RAM, or expand the SAN for data storage.
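
On a KVM/libvirt stack, for example, that reallocation can be a one-liner per guest; here’s a minimal sketch with hypothetical guest names:

# give the compute-heavy guest more vCPUs (applied on next boot)
$ virsh setvcpus compute-app 8 --config
# hand the memory-hungry guest 16 GiB (virsh counts KiB by default)
$ virsh setmem data-app 16777216 --config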

Around the same time, the CPU itself underwent even further commoditization; we could call it level 3. Multiple solutions were implemented in processors to improve their efficiency without increasing the clock speed, or when further miniaturization of the manufacturing process was not yet achievable. One of the key optimization areas was based on the observation that a lot of potential is lost during the processor cycle itself. As a result, various technologies were introduced: multi-core designs packing several physical cores onto one die, Hyper-Threading (Intel’s implementation of SMT) exposing logical cores, and VT instructions for the actual virtualization. The result is logical cores within physical cores forming a modern CPU.
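
You can see this layering on any Linux box; lscpu reports it directly (sample output from a hypothetical two-socket machine):

$ lscpu | grep -E 'Thread|Core|Socket'
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2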

Level 4 is a more recent tale, when the definition of SDN, or “software-defined networking”, appeared. In simple words, the virtualized stack of standard commodities could be expanded to cover the network as well, which in the internet era is a commodity in itself. The idea is to present a complete set of infrastructure items to meet the development design, which naturally speeds up the whole infrastructural design phase of a project and, most importantly, offers a great deal of simplicity to everyone involved in the development. Deployment of new applications or services has never been this easy.

With software-defined networking and pools of virtual resources at hand, it is not hard to notice that this approach still borrows from the good old “application hosted on a server” concept of what I called the “level 1” era. Each virtual server comes with a lot of overhead, including the operating system and its “wealth” of features. An OS is usually prepared with the idea of just working for as many setups as possible, which by definition makes it “not the most optimal piece”. Level 5 of commoditization comes to the rescue: containers, among them the most popular container technology, Docker. A single server can run multiple containers, which from the outside look like virtual machines themselves; however, they contain only the bare minimum of the OS that the applications hosted in them need to operate. Brilliant, isn’t it? Add to that software to orchestrate multiple containers, such as Kubernetes.
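
To make the “bare minimum” point tangible: a complete image definition can be as short as this sketch (myservice is a hypothetical, statically linked binary):

# Small base OS plus exactly one application, nothing else.
FROM alpine:3.6
COPY myservice /usr/local/bin/myservice
EXPOSE 8080
ENTRYPOINT ["/usr/local/bin/myservice"]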

So what’s next? Commoditized infrastructure pieces and a commoditized operating system allow a granular choice of elements to operate applications or services. It doesn’t look like much more can be achieved here. Who can further push the boundaries? The answer is based on the fact that, just like the OS, almost all concepts in popular technology are designed so that 95% of the use cases succeed. But what happens when you’re in the remaining 5%? And who usually ends up there? For the last decade, in my opinion, two areas fall into that 5%: high frequency trading and high performance gaming. Add to them the extreme data center owners (aka hyperscalers), such as Google or Facebook, and you’ll notice they all already have the solution. Software will only ever be software: it gets loaded into standard memory and its operations execute on a standard core. But if the same operations happen a million times per day on a standard piece of commodity hardware, why not move them to actual hardware?

FPGAs, or field-programmable gate arrays, are becoming popular because they allow just that. The software developer’s work doesn’t have to end as soon as the code is sent to the compiler. The compiler allows the code to be loaded and executed by a typical CPU, which by design is universal and can do a lot of things. Most software, however, performs a set of standard operations and then something unique to its nature, which gets repeated millions of times. FPGAs allow commoditization of that unique, repeatable activity, which, when compiled onto the programmable board, can gain a performance boost of 10-100x at only a 20x power increase. That is an immense cost reduction compared to simply scaling the pool of CPU/MEM resources.

Does this mean application programming has to become very low-level again, making a huge turn away from the era of rapid development? Not at all; there are tools available that make it quite approachable, for example by utilizing Verilog. Certain standard blocks (DSPs) are pre-installed on the boards, allowing developers to implement only the higher-level logic. At the same time, programmable boards are available at reasonable prices and can be re-programmed, unlike the super-expensive custom boards produced on demand for HFT companies.
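
To give a flavor of it, here is a minimal Verilog sketch of the kind of “repeatable activity” one might push into hardware: a clocked multiply-accumulate, the bread and butter of DSP blocks (my own toy example, not from any vendor toolkit):

// Toy multiply-accumulate unit: on every clock edge, add a*b to the
// running accumulator. Synthesis tools map the multiply onto a DSP block.
module mac8 (
    input  wire        clk,
    input  wire        rst,
    input  wire [7:0]  a,
    input  wire [7:0]  b,
    output reg  [31:0] acc
);
    always @(posedge clk) begin
        if (rst)
            acc <= 32'd0;
        else
            acc <= acc + a * b;
    end
endmodule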

FPGAs are the next big thing. If the above combination of breakthroughs did not convince you, consider the battle to purchase Lattice, a vendor of FPGAs only, for $1.3 billion. Consider the rise of Xilinx, specializing in programmable SoCs. Last but not least, Intel acquired Altera about a year ago precisely to join this promising business.

So what can the CIO/CTO do? While it might not be easy to find Verilog/VHDL specialists who are also capable developers, ones who understand time-to-market and quality programming concepts, it was only a matter of time until vendors tried to fill the gap. Amazon is already offering EC2 instances with programmable hardware. OVH offers a similar solution called RunAbove, which at the time of writing this article is completely sold out. Last but not least, there’s Reconfigure.io, offering a Google Go language compiler that optimizes developers’ code to be installed directly onto programmable boards.

What’s next? Maybe containers hosting microservices could be moved to FPGAs?

3 questions to…

Here’s something much easier to read, hopefully, compared to the previous, quite technical posts. MoneyGram International runs a “3 questions to…” series in the corporate newsletter, with the aim of bringing the team closer and shedding some light on the profiles of senior leadership in the company. My turn to answer the 3 questions came last week, and I’m glad it did, because that’s always an opportunity to practice non-technical writing. I didn’t expect much out of it, but the feedback was exceptionally positive. Well, positive enough to convince me it might be worth sharing on LinkedIn. What do you think?

You’re managing multiple IT Teams at MGI. What skills do you find exceptionally useful in your role?

Empathy, the queen of all skills, probably deserves the top spot. It is especially important if your teams work from all around the globe and come from a variety of cultures. Leading such teams is often a challenge, even in IT, where information is crisp and precise. The same message is worth conveying in a variety of ways, staying considerate of how others might look at challenges or upcoming projects. Likewise, sometimes substantially different encouragement is needed in order to gain valuable input and uncover the true potential of individuals. Being considerate of those differences while not losing track of the technical bits and the end goal is a great challenge to have. I believe that empathy is the key to victory in this aspect.

What are the biggest challenges you face at MoneyGram and how do you address them?

I’m sure everyone has mentioned the time zone difference between Europe and the US, so I’ll skip that and mention something from the technology perspective. While we deliver interesting solutions to the market, I believe we can still do better in terms of internal and customer-facing technology. Although staying abreast of technology is formally dedicated to roles outside of IT Operations, I’m putting in extra effort to bring up ideas and secure resources for new projects.

Based on your professional experience, what career advice would you give to your colleagues at MoneyGram?

I try to give advice to my team (then it’s called “direction” ;) ) all the time, so giving one general piece of advice now without sounding trivial is not an easy task, but let me give it a try. I believe that many people underestimate the power of research. Every time I hear or read about anything I don’t know or am not sure about, I make sure I write it down. I then have a dedicated time slot, negotiated carefully with my wife, where I follow up on each item on my research list. Sometimes reading is not enough and I end up building a proof of concept. I never let information get past me without verification: if there’s a chance to broaden my knowledge in the area I’m working in, then it is certainly worth following through. A strong understanding of all aspects means working less on assumptions, which gives confidence a great boost. And what’s a leader without confidence?

Could SHA-1 and Dirty COW have something in common?

SHA, or Secure Hash Algorithm, is a hash function initially announced in 1995 by the United States’ NSA. It became widely popular, with applications in password hashing, certificates and a variety of software. The implementation made it into the SSL (and later TLS) collections, SSH, PGP and many more. From the very moment of publication, SHA-1 was contested by cryptography researchers from all over the world, but without much success. It took 10 years after the initial publication for the first papers to appear claiming to weaken SHA-1 to some extent. The papers remained theoretical due to the cost: it would have taken almost $3 million worth of CPU power to execute the attack. It took another 10 years for a group of security researchers to further weaken SHA-1 and actually demonstrate an attack by moving the computation to 64 Nvidia GPUs.

And why am I writing this? Mainly because this is a story of great success: for a hash function to remain unbroken for over two decades. Still, around 2015 various security authority boards decided that, by Moore’s Law, within the next 2-3 years attacking SHA-1 by brute force was going to become economically viable and generally available. Based on that, SHA-1 was to be gradually removed from public use.

On a side note, here’s one very intriguing case with lots of food for thought. Due to the wide adoption of SHA-1, the ban was instated by various decision makers at different times. In 2015, the CA/Browser Forum decided that SSL certificates using SHA-1 could no longer be issued after the 1st of January 2016. Was this enough to stop all CAs from issuing the popular SHA-1 certificates, sometimes even hard-coded into proprietary software? Did auditors save the world, as various company boards widely expect? Read the full story and brilliant analysis here.
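
Incidentally, checking what your own certificates use takes one line (cert.pem being a placeholder for your certificate file):

$ openssl x509 -in cert.pem -noout -text | grep 'Signature Algorithm'
    Signature Algorithm: sha1WithRSAEncryption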

Back to the original subject: I got a bit nostalgic for a reason, and it is not because I hadn’t had steak for a while. Dirty COW is a funny acronym for a rare race condition introduced into the Linux kernel’s copy-on-write code around 2007 (kernel 2.6.22). It is not, however, a new vulnerability. Linus Torvalds revealed that even though it was discovered a long time ago, the race condition leading to privilege escalation was considered rare enough that it was actually set aside; it was concluded that the computational resources required to trigger the condition were not realistic. And almost 10 years later, anyone can execute a local exploit and bypass the internal security countermeasures companies spend huge sums on. The exploit is easy to obtain and execute, and has a very high success rate.

Two completely different cases and security risks. And a complete contrast in strategy, or the lack thereof. A decision to obsolete SHA-1 well in advance of practical vulnerabilities, despite a decent design. And no decision at all for a ticking bomb. I have read many quotes by Linus on security (or actually against security). Don’t get me wrong, he is still a brilliant lead for kernel development, but why wouldn’t the Linux Foundation add the missing strategy? Track security bugs that were closed for a reason like this one, and react on time? Someone might say “that’s not their job”. If reality requires it, it has to become someone’s job.