Resi Dwi Thawasa

Signing Apple promotional offers with JWS

2025-12-22T00:00:00+00:00

We wanted to give existing subscribers a promotional price. The target is someone who subscribed before and has lapsed, or is about to, and whom we want to win back with a discount. On Apple, you cannot just flip a price. The server has to send Apple a signed token that says “this user is allowed this offer,” and Apple checks the signature before honoring it.

The signature format is JWS. There is a fun continuity here for me: a couple of years back I was on the other side of this, verifying signed callbacks from an ad network. Now I am the one doing the signing.

What JWS is, briefly

JWS stands for JSON Web Signature. The idea is the same as any signature scheme. You have some data, you sign it with a private key, and anyone with the matching public key can confirm the data is genuine and unchanged.

What JWS adds is a standard layout. A JWS has three parts joined by dots:

header.payload.signature

The header says how it was signed and which key was used. The payload is the data you are vouching for. The signature is computed over the encoded header and payload, using your private key. Apple holds the public side (you register your key with them), so they can verify what you send.

For these offers I was on the signing side, which means I needed three things right: the key, the key id, and a nonce.

The key is the private signing key Apple gave us a key id for. The key id goes in the header so Apple knows which of your keys to verify against, which also means you can rotate keys without breaking everything. The nonce is a one-time value tied to the request so the same signed token cannot be replayed later for a different transaction.

The signing itself is short. Most of the work is assembling the exact payload Apple expects and not fumbling the key handling.

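require "jwt"          # the ruby-jwt gem provides JWT.encode
require "securerandom" # for the nonce

header = {
  alg: "ES256",
  kid: APPLE_KEY_ID, # which registered key Apple should verify against
}

payload = {
  productId:    offer.product_id,
  offerId:      offer.id,
  nonce:        SecureRandom.uuid, # fresh per request, so the token cannot be replayed
  timestamp:    (Time.now.to_f * 1000).to_i, # milliseconds since the epoch
  # plus the other fields Apple requires
}

# signing_key is the EC private key, e.g. loaded as an OpenSSL::PKey::EC
jws = JWT.encode(payload, signing_key, "ES256", header)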

ES256 means the signature uses elliptic-curve crypto with SHA-256, which is what Apple wants here. The thing that bit me, same as last time on the verifying side, was getting the payload fields and their order exactly as specified. A signature that is cryptographically fine still gets rejected if the payload is not assembled the way the other side expects.

Eligibility is its own problem

Signing the token is only half of it. The other half is deciding who gets the offer at all, and that turned out to need more care than the crypto.

The offer was meant for returning users: people who had subscribed before. A brand-new user who never subscribed should not get the win-back price, because then it is not winning anyone back, it is just a discount for everyone. So before signing anything, I check eligibility on our side: does this user have a subscription history that qualifies, are they in the window we are targeting, have they already used this offer.

The reason this matters is that signing is a statement of trust. When my server signs the token, it is telling Apple “yes, this person qualifies.” If my eligibility check is loose, I am signing off on offers that should never have gone out, and the signature does not save me there. The crypto proves the message came from us. It says nothing about whether we should have sent it.

So the order is: check eligibility first, and only sign if it passes. The signature is the last step, not the gate.

The two sides of a signature

It was a nice thing to notice that the same primitive shows up on both ends of my work, years apart. Back then I held a public key and checked that incoming callbacks were genuine. This time I hold a private key and produce tokens that someone else checks. It is the same crypto, just from the other side.

The practical takeaways are the same on both sides though. Get the signed bytes exactly right, treat the key id as the thing that lets you rotate keys safely, and remember the signature only proves where a message came from. Whether the message should exist at all is a separate decision you still have to make.

Don’t create a new row for every retry

2024-10-15T00:00:00+00:00

We had a recurring-payment flow in our Rails app that, over a few months, quietly created thousands of failed transaction rows for a handful of users. The database kept growing and nobody could explain why the transaction table was so big relative to the number of actual customers. The cause turned out to be a retry loop that made a new row every single time it tried.

The flow

A recurring charge runs on a schedule. We tell the payment provider to charge the saved payment method, and we record a transaction row for it. Usually it succeeds and we are done.

Sometimes the provider returns a 5xx. That is not a “the card was declined” answer, it is a “something is broken on our side, try again later” answer. So we retried. Every few minutes, we tried again.

Here is the part that was wrong. Each retry started from the top of the flow, and the top of the flow created a new transaction row.

def charge_recurring(subscription)
  transaction = subscription.transactions.create!(
    amount: subscription.price,
    status: :pending,
  )

  result = provider.charge(transaction)
  transaction.update!(status: result.success? ? :succeeded : :failed)
end

So when the provider was having a bad hour and returning 5xx for one user, that user got a new failed row every few minutes. A few hours of that is dozens of rows. A provider outage stretched over a day, across a few unlucky users, is thousands.

Why this was bad beyond the row count

The bloat was the visible problem, but it pointed at a deeper one. We were treating each retry as a brand-new attempt to charge, when really it was the same charge being retried. The state of “this user owes us this month’s payment” lived in many rows instead of one, so it was hard to answer simple questions. Has this user been charged this month? Well, there are forty rows, most failed, and you have to reason about all of them to know.

It also meant our retries were not idempotent. An idempotent operation is one you can run many times and get the same effect as running it once. Creating a row on every attempt is the opposite: each run leaves another mark.

The fix

Two changes.

First, stop creating a row per retry. Create the transaction once, in a processing state, and retry against that same row. If the charge fails with a retryable error, leave the row in processing and try again later. Only move it to failed when we have actually given up, and to succeeded when it goes through.

def charge_recurring(subscription)
  transaction = subscription.transactions.find_or_create_by!(
    period: subscription.current_period,
  ) do |t|
    t.amount = subscription.price
    t.status = :processing
  end

  return if transaction.succeeded?

  result = provider.charge(transaction)
  if result.success?
    transaction.update!(status: :succeeded)
  elsif result.retryable?
    transaction.touch(:last_attempted_at) # stays processing
  else
    transaction.update!(status: :failed)
  end
end

The find_or_create_by! keyed on the billing period is what keeps it to one row. The first run creates it. Every retry finds the same one. There is now exactly one row that represents “this user’s charge for this period,” and its status tells you where it stands.
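The same idea outside of ActiveRecord, as a small in-memory sketch in Go. The Ledger type and its map are stand-ins I made up for the transactions table. In a real database you would want a unique index on the key so two concurrent runs cannot both create a row; here the map plays that role.

```go
package main

import "fmt"

type Status string

const (
	Processing Status = "processing"
	Succeeded  Status = "succeeded"
	Failed     Status = "failed"
)

// chargeKey is the natural key: one transaction per subscription per period.
type chargeKey struct {
	SubscriptionID int
	Period         string // e.g. "2024-10"
}

type Transaction struct {
	Status   Status
	Attempts int
}

// Ledger stands in for the transactions table (hypothetical type).
type Ledger struct {
	rows map[chargeKey]*Transaction
}

func NewLedger() *Ledger { return &Ledger{rows: map[chargeKey]*Transaction{}} }

// findOrCreate returns the existing row for the key, creating it only once.
func (l *Ledger) findOrCreate(k chargeKey) *Transaction {
	if tx, ok := l.rows[k]; ok {
		return tx
	}
	tx := &Transaction{Status: Processing}
	l.rows[k] = tx
	return tx
}

// Charge retries against the same row. The charge callback reports
// (success, retryable), standing in for the provider call.
func (l *Ledger) Charge(k chargeKey, charge func() (ok, retryable bool)) Status {
	tx := l.findOrCreate(k)
	if tx.Status == Succeeded {
		return tx.Status // already charged: running again is a no-op
	}
	tx.Attempts++
	ok, retryable := charge()
	if ok {
		tx.Status = Succeeded
	} else if !retryable {
		tx.Status = Failed
	}
	// On a retryable failure the row simply stays in processing.
	return tx.Status
}

func main() {
	ledger := NewLedger()
	key := chargeKey{SubscriptionID: 7, Period: "2024-10"}
	outage := func() (bool, bool) { return false, true } // provider keeps returning 5xx

	for i := 0; i < 40; i++ {
		ledger.Charge(key, outage)
	}
	// Forty attempts, still one row.
	fmt.Println(len(ledger.rows), ledger.rows[key].Status, ledger.rows[key].Attempts)
}
```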

Second, slow the retries down. Trying again every few minutes during a provider outage is pointless and just hammers them. We moved to a backoff: wait a few minutes, then longer, then longer still, up to a cap, and give up after some number of attempts. The provider gets room to recover and we are not generating noise.

The lesson

Retries are supposed to repeat an attempt, not multiply state. If each retry leaves behind a new row or some other side effect, then the retry is not really retrying the same operation, it is doing a fresh one each time.

The fix is to have a stable thing the retry acts on (one row, found by a natural key like the billing period) and to make the operation safe to run again. Pair that with a sane backoff so a bad hour at the provider does not turn into a wall of garbage in your database.

Moving from VS Code to Neovim

2024-08-05T00:00:00+00:00

I switched my main editor from VS Code to Neovim about two months ago, and I think it is going to stick.

This is not a “VS Code is bad” post. VS Code is great. I used it for years and it rarely got in my way. The honest reason I switched is that I was curious, I already live in the terminal most of the day (tmux, a bunch of panes), and jumping out to a separate window started to feel like a small tax I paid a hundred times a day. I wanted to try staying in one place.

The first week was rough

I want to be honest about this part, because nobody warned me enough. The first week I was slow. Really slow. I knew the Vim motions in theory, and I had used the Vim extension in VS Code before, but configuring a whole editor from nothing is a different thing. I kept a sticky note of keymaps next to my keyboard. I almost went back twice, both times around 5pm when I just wanted to finish a task and not fight my tools.

It got better in the second week. By the third, the motions were in my fingers and I stopped thinking about them.

What my setup looks like

I went with a config built from scratch instead of a ready-made distro, because the whole point for me was to understand every line. I use lazy.nvim to manage plugins. The pieces I reach for every day:

telescope for fuzzy finding and grepping the project, backed by ripgrep and fzf
treesitter for syntax and better text objects
nvim-lspconfig with mason for language servers, and nvim-cmp for completion
conform and nvim-lint for format-on-save and linting
which-key, so when I forget a keymap the menu reminds me
vim-rails, vim-projectionist and fugitive, since I write a lot of Ruby and Go
vim-tmux-navigator, so the same Ctrl-h/j/k/l moves between vim splits and tmux panes

The theme is Catppuccin Mocha, the same one I use almost everywhere, including this blog. Leader is the space bar. I spent an embarrassing number of hours getting the Tailwind language server to behave inside Rails form helpers, which is exactly the kind of yak-shaving you sign up for when you go down this road.

Is it actually better?

For some things, clearly yes. Moving around a codebase with telescope and treesitter motions is faster than I was before. Macros and the dot command save me real time on repetitive edits. And there is something nice about an editor that opens instantly and barely uses any memory.

For other things, not yet. Debugging is still a smoother experience for me in VS Code. And every few weeks I lose an evening to configuring something that just worked before. That is part of the deal.

The biggest thing is harder to measure. The setup is mine now. When something annoys me I can fix it, because I wrote the config and I understand it. That feeling is worth a lot.

If you are thinking about trying it

A few things I would tell myself two months ago:

Do not configure everything on day one. Start small, add things when you actually feel the need.
Learn the plain motions before the plugins. The plugins are nice, but the motions are the real reason to be here.
Keep VS Code installed. Having a fallback for the bad days takes the pressure off.
Give it a month before you decide. The first week is not representative.

Two months in, I am not going back. I also do not think I will ever be “done” with the config, and that is part of why I like it.

Verifying rewarded-ad callbacks server-side

2024-04-09T00:00:00+00:00

Rewarded ads are the ones where a user watches a video and gets something in return. Finish the ad, get some energy, or coins, or an extra life. The flow is nice for everyone when it works: the user gets a reward, we get ad revenue.

The part that is easy to get wrong is how you decide the user actually earned the reward.

The naive version

When the ad finishes, the ad network tells you about it with a callback. Your server gets a request that says, in effect, “user X finished the ad, give them their reward.” You look up user X and add the coins.

If you stop there, you have built a coin printer. Anyone who can figure out the shape of that callback can send it themselves. They do not need to watch an ad. They just hit your endpoint with the right user id and farm rewards all day. And these endpoints are not secret. They get discovered.

So you cannot trust the callback just because it arrived. You have to verify that it really came from the ad network and was not forged.

Why client-side checks do not help

The first instinct is to check things on the client. Confirm in the app that the ad really played, then call your own server.

That does not work, and the reason is simple: the client is fully under the attacker’s control. Someone running a modified app, or just replaying requests with a proxy, can make the client say whatever they want. Any check that runs on a device you do not control is a suggestion, not a guarantee. The decision about whether a reward is real has to happen on the server, using something the client cannot fake.

Signed server-to-server callbacks

The pattern that does work is a signed callback delivered server to server.

The ad network sends the callback straight to your server, not through the app. With each callback it includes a signature: a value computed over the callback contents using a private key that only the ad network holds. You hold the matching public key. You hash the same contents yourself and check that the signature verifies against that public key.

If it verifies, the callback genuinely came from the ad network and nobody changed it on the way. If it does not, you drop it. An attacker cannot produce a valid signature because they do not have the private key, and that is the whole point of public-key signatures.

A real detail here is key rotation. Networks rotate their signing keys, so the callback also carries a key id. You keep a small set of public keys, look up the one named by the key id, and verify against that. When the network rolls to a new key, the key id changes and you have already fetched the new public key, so verification keeps working without a scramble.

In Go the verify step is small. The hard work is keeping the right public keys around and getting the signed bytes exactly right.

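// verify reports whether sig is a valid ASN.1-encoded ECDSA signature
// over the SHA-256 hash of signedBytes.
func verify(pub *ecdsa.PublicKey, signedBytes, sig []byte) bool {
    hash := sha256.Sum256(signedBytes)
    return ecdsa.VerifyASN1(pub, hash[:], sig)
}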

The thing to be careful about is signedBytes. You have to sign and verify over the exact same bytes in the exact same order the network specified. If you reorder query parameters or re-encode something, the hash changes and a perfectly valid callback fails to verify. I lost time to this: my signature check kept failing not because the signature was bad but because I was building the signed string slightly differently than the network did.

What I would tell myself starting out

Treat every incoming callback as hostile until the signature says otherwise. The reward is money, and anything that grants money on an unverified request will get abused.

And keep the verification on the server with a key the client never sees. The client can be helpful for the experience, but it cannot be the thing that decides whether a reward is real.

Refactoring a Go service to 90% coverage

2023-12-19T00:00:00+00:00

I spent a chunk of last quarter refactoring a Go service that backs our ads. It had grown the way these services do: one big package, functions that did three things each, and almost no tests. I wanted to make it easier to change without being scared every time. Along the way the test coverage went from somewhere low to about 90%, and the tests caught more real bugs than I expected.

Why I bothered with tests at all

The honest answer is that I did not trust myself to refactor it safely without them. The service handled real traffic and real money. If I broke something quietly, I would not find out until it showed up in a graph.

So before pulling functions apart, I wrote tests around the current behavior. Not perfect tests, just enough to pin down what the code did right now. Then I refactored, and if a test went red I knew the refactor changed behavior, not just shape. This is the loose version of TDD: I am not writing tests for code that does not exist yet, I am writing them to lock down code I am about to move around.

What the tests caught

Three things stand out.

First, nil and empty edge cases. A function took a slice of ad slots and picked the best one. When the slice was empty, it indexed into it and panicked. In production this almost never happened because there was usually at least one slot, but “almost never” is not never. The test that passed an empty slice failed loudly, and the fix was two lines.

Second, a config path nobody had ever tested. The service read a config value that switched between two pricing modes. One branch was exercised constantly. The other branch was only used for a specific kind of campaign, and it had a bug that had probably been there for a year. Nobody noticed because no test and very little traffic ever hit it. Writing a test for that branch surfaced it immediately.

Third, a crashloop right after a refactor. I had split a struct into two, and one of the new constructors left a field as its zero value when it should have been initialized. The service booted fine in most environments but crashlooped in one because that field was dereferenced during startup. A startup test that just constructed the service and ran its init caught it before it ever shipped. Without the test it would have been a deploy, a crashloop, a rollback, and an afternoon of confusion.

How tests changed the refactor

The thing I did not expect was how much the tests changed the way I worked, not just the result.

With a test suite I trusted, I stopped being careful in the cautious, slow way and started being careful in a faster way. I could rename things, move functions between files, change a signature, and run the tests. Green meant keep going. Red meant look closer. I did dozens of small refactors in a row that I would never have risked on the old untested code, because each one was cheap to verify.

That changed the pace. Refactoring untested code feels like walking on ice. Refactoring code with good coverage feels like normal walking.

func TestPickSlot_Empty(t *testing.T) {
    got, err := PickSlot(nil)
    if err == nil {
        t.Fatal("expected error for empty slots, got nil")
    }
    if got != nil {
        t.Fatalf("expected nil slot, got %v", got)
    }
}

A test that boring is the kind that catches the panic at 2am.

The payoff during peak traffic

The payoff showed up during a high-traffic stretch. The refactored service ran through it without the kind of small errors we used to see. Not because the code was magically better, but because the obvious holes (the nil case, the untested branch, the bad constructor) were already closed before traffic ever reached them.

I am not going to claim 90% is some target everyone should hit. The number is not the point. The point is that writing tests around behavior before changing it let me change it a lot, and the bugs it caught were the kind that hide in the paths you never look at.

Autoscaling workers on queue depth, not CPU

2023-11-21T00:00:00+00:00

We had a set of background workers that read messages off a queue and processed them. The queue is Google Pub/Sub. The workers run on Kubernetes, and they were autoscaled by an HPA that watched CPU. On paper that sounds reasonable. In practice it scaled at the wrong times, and sometimes not at all.

The symptom was a growing backlog. Messages would pile up, the lag would climb into the tens of minutes, and the workers would just sit there at low CPU not scaling up. By the time anyone noticed, we were way behind.

A quick word on HPA

The Horizontal Pod Autoscaler watches a metric and adds or removes pods to keep that metric near a target. The common setup is CPU. You say “keep average CPU around 60%.” If CPU goes above that, the HPA adds pods; if it drops below, it removes them.

CPU works well for request-serving services. More traffic means more CPU, more CPU means more pods, and the new pods take real load off the old ones. The signal and the work line up.

Why CPU is the wrong signal here

Queue-draining work breaks that link.

A worker pulling from Pub/Sub spends a lot of its time waiting. It waits on the network to pull a message, waits on a database call, waits on some downstream API. While it waits it uses almost no CPU. So you can have a huge backlog and workers that are busy but not CPU-busy. The HPA looks at CPU, sees it sitting at 20%, and concludes everything is fine. It does not scale up. Meanwhile the backlog keeps growing.

The reverse happens too. A burst of cheap messages can spike CPU for a moment and trigger scale-up even though there is barely any backlog, so you scale on noise.

The core issue is that CPU does not measure the thing we actually care about. We care about how far behind we are. CPU is a poor proxy for that.

Scaling on backlog instead

What we actually want is to scale on queue depth: the number of messages that have been published but not yet acknowledged. That tells us how much work is waiting. If it climbs, we are falling behind and need more workers. If it sits near zero, we have enough.

Pub/Sub exposes this. The metric is the number of undelivered (unacked) messages on a subscription. Kubernetes can scale on it as an external metric, meaning a metric that lives outside the cluster. You run an adapter that reads the value from Cloud Monitoring and feeds it to the HPA.

The HPA config then targets a per-pod backlog instead of CPU. The shape is something like this:

metrics:
  - type: External
    external:
      metric:
        name: pubsub.googleapis.com|subscription|num_undelivered_messages
        selector:
          matchLabels:
            resource.labels.subscription_id: my-subscription
      target:
        type: AverageValue
        averageValue: "100"

The way to read averageValue: 100 is “aim for about 100 undelivered messages per pod.” If there are 1000 messages waiting, the HPA wants roughly 10 pods. If there are 50, one pod is plenty. As the backlog grows, pod count grows with it, and the new pods pull messages off the same subscription so the backlog actually comes down.

You pick the target number based on how fast one pod drains messages and how much lag you can tolerate. A smaller number scales up more aggressively and keeps lag low but uses more pods. We tuned it by watching the lag during a normal busy period and adjusting.

What changed

After the switch, scale-up happened when the backlog grew, not when CPU happened to twitch. During a spike the workers fanned out, drained the queue, and scaled back down. The lag stopped being a thing we got paged about.

The lesson I took from it: autoscale on the metric that describes the work, not the metric that is easiest to grab. For a worker draining a queue, that metric is the queue depth.

The worker that kept eating its own memory

2023-10-10T00:00:00+00:00

For a few weeks we had a background worker that slowly ate memory until the pod got killed, restarted, and did it all over again. It ran a heavy detection job on video files, so the first guess was that it simply needed more memory. We raised the limit. It still died, just took longer to get there. I spent the better part of a week on this before I actually understood it.

The worker is a Go service. The shape of the job is simple. It picks up a task, downloads a video file to local disk, runs detection on it, writes the result somewhere, and then deletes the temp file. Nothing fancy.

What I assumed

I assumed Go was holding onto memory. Maybe a slice that kept growing, maybe a buffer I forgot to reset between files. The detection step reads a lot of bytes, so it felt plausible that I was loading whole files into memory and not letting them go.

So I went looking there first. I added pprof, took a few heap profiles, and stared at them. The heap looked fine. It went up while a job ran and came back down after. No obvious leak in the Go sense. That was the first dead end, and it cost me a couple of days.

What was actually wrong

The thing I missed was on disk, not in the heap.

The cleanup that deleted the temp file ran at the end of the function, after detection finished. Like this, roughly:

func process(task Task) error {
    path, err := download(task.URL)
    if err != nil {
        return err
    }

    result, err := detect(path)
    if err != nil {
        return err // file never deleted
    }

    if err := save(result); err != nil {
        return err // file never deleted
    }

    os.Remove(path)
    return nil
}

See the problem? When detection failed partway through (and on some files it did fail), the function returned early. The os.Remove at the bottom never ran. The temp file stayed on disk.

Every failed job left a file behind. Over hours, those files piled up. The pod has a small writable layer, and once that filled we started getting “no space left on device” errors. The memory side made it worse because some of those files were being mapped and the page cache filled up too, so the pod looked like it was running out of memory even though my Go heap was healthy.

So it was not one problem. It was leftover state on disk that showed up as both a disk error and what looked like a memory problem. Raising the memory limit only delayed the moment the disk filled.

The fix

Make the cleanup run no matter how the function exits. In Go that is what defer is for.

func process(task Task) error {
    path, err := download(task.URL)
    if err != nil {
        return err
    }
    defer os.Remove(path)

    result, err := detect(path)
    if err != nil {
        return err
    }

    return save(result)
}

Now the file is deleted whether detection succeeds, fails, or panics. I moved the defer to right after the download so there is no path where a file is created but not scheduled for removal.

I also added a small startup step that clears the temp directory when the worker boots, so a pod that died mid-job does not start its next life with old junk already sitting there.

After that the memory graph went flat. No more slow climb, no more restarts.

What I would check first next time

OOM is not always “add memory.” Before touching the limit I should have looked at what the pod was actually doing on disk. Running df -h inside the pod would have taken about thirty seconds and shown the disk filling. A heap profile is the right tool when the heap is the problem, but I reached for it because it was the tool I knew, not because the evidence pointed there.

The other lesson is plainer. Any code that creates a temp file should schedule its removal on the next line, before the work that might fail. Cleanup that only runs on the happy path is not cleanup.

Pairing and TDD changed how I work

2022-09-20T00:00:00+00:00

The team I joined practices Extreme Programming, or XP. In practice, for me, that has mostly meant two things: pair programming and test-driven development. I had done neither in any serious way before. A few months in, both have changed how I work, and I have feelings about it.

Pairing, the hard parts first

I will start with the honest part. Pairing was exhausting at the beginning.

You sit with another person and write code together, one keyboard, two brains. There is no zoning out. There is no quietly googling something for ten minutes while you figure out what you are doing. You have to think out loud, constantly, and that is tiring when you are not used to it.

It also felt slow. Two people, one task, surely that is half the speed? That was my math at first.

And you cannot hide. If I do not understand something, my pair knows immediately, because I have to explain my reasoning as I go and it falls apart out loud. As the newer person on the team, that was uncomfortable. I wanted to look like I knew things.

Why I came around

A few things changed my mind.

It caught bugs early. Not in code review days later, but in the moment, before the bad idea even got typed. My pair would say “wait, what about an empty list?” and we would handle it right then.

It spread knowledge fast. I learned the codebase, the tools, the team’s habits, much faster than I would have alone. And it goes both ways, so I was not just taking.

And explaining my thinking out loud made me a clearer thinker. When you have to say why you are doing something, you notice when the why is weak. A lot of my bad ideas died the moment I tried to say them to another person.

It is still tiring. I just think the tiredness buys something now.

TDD

Test-driven development was the other shift. The loop is red, green, refactor: write a failing test, write the smallest code to make it pass, then clean it up.

Writing the test first felt backwards. How do I test something that does not exist yet? But that is sort of the point. Writing the test makes you decide what the thing should do before you build it. The test becomes a design tool, not just a check at the end.

The part I did not expect to value so much is the confidence to refactor. With a wall of tests behind me, I can rip apart a piece of code and rearrange it, and if I break something the tests tell me right away. Before, I was scared to touch working code. Now I am much less scared.

Mixed feelings, still

I do not want to make this sound like I converted and now I love everything. Pairing all day still drains me. Sometimes I miss working alone with my own thoughts. And TDD slows the start of a task, even if it speeds up the end.

But I write better code now, and I understand it better, and I trust it more. So I am not arguing with the method anymore. I am just learning to do it with less resistance.

What I took from Design Patterns in Ruby

2022-07-11T00:00:00+00:00

I read “Design Patterns in Ruby” recently. It walks through the classic Gang of Four patterns but written the way you would actually write them in Ruby, not Java translated word for word. I also leaned on a nice summary by davidgf while reading.

I am not going to list all the patterns. Half of them I will forget the names of by next month anyway. What stuck with me were a few guiding ideas that sit underneath all of them.

The ideas underneath

Separate what changes from what stays the same. Most of these patterns exist to put a wall between the part of your code that keeps changing and the part that does not. Find the thing that varies, isolate it, and the rest of the code stops caring about it.

Program to an interface, not an implementation. Depend on what an object can do, not on what class it is. This makes it easier to swap one thing for another later without rewriting everything around it.

Prefer composition over inheritance. Instead of building tall inheritance trees, give an object the pieces it needs by handing them to it. I had heard this before but the book made it click why: inheritance ties you down hard, composition leaves you free to rearrange.

Delegate. When an object gets asked to do something that is not really its job, hand it off to another object that owns that job. Keep responsibilities where they belong.

You ain’t gonna need it (YAGNI). Do not build the flexible, future-proof version until you actually have the future problem. Most of the time the future you imagined never arrives.

My main takeaway

The thing I keep coming back to is this: patterns are tools, not goals.

When you first learn patterns there is a strong temptation to use them. You see a factory-shaped hole everywhere. You wrap things in strategies that have exactly one strategy. I have done this. It feels smart and it makes the code worse.

The book made me more careful. Only reach for a pattern when you have the actual problem it solves, right now, in front of you. If you apply it because it might be useful one day, you have just over-engineered the thing and added indirection nobody asked for. That ties back to YAGNI.

The Ruby part

The book also points out things Ruby gives you that change how some patterns look, or make them unnecessary.

You can build small DSLs (domain specific languages) because the syntax is flexible enough to read almost like configuration. Metaprogramming lets objects define methods on the fly, so some patterns that exist to work around rigid languages just melt away. And convention over configuration, which Rails leans on heavily, removes a lot of the wiring that other languages need patterns for.

So a few patterns from the Java world feel like solutions to problems Ruby does not have.

Overall a good read. Not because I will recite the patterns, but because it made me ask “do I actually need this?” before adding structure. That question alone was worth it.

Learning Go after years of dynamic languages

2022-04-19T00:00:00+00:00

I started writing Go about six months ago. At work, the backend ad services are in Go, so I did not really have a choice, and I am glad about that now.

My background is dynamic languages. Years of PHP, and more recently Ruby. So Go was a different way of thinking. This is not a tutorial, just some notes on what clicked and what annoyed me after half a year of it.

What clicked

Explicit error handling. In PHP and Ruby I throw and rescue exceptions, and errors can come flying out of anywhere. In Go, a function returns an error as a value, and you handle it right there. At first this felt primitive. Now I find it easier to reason about, because I can see exactly where things can go wrong by reading top to bottom.

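Zero values. Every type has a sensible default. A string is "", an int is 0, a bool is false, and a map or slice starts as nil. A nil slice behaves like an empty one, so you can append to it right away. A nil map reads as empty, but you have to initialize it with make before you write to it. So you do not get the “undefined variable” surprises I was used to. Things are always something.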
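One nuance is worth seeing in code: the zero value of a slice or map is nil. A nil slice behaves like an empty one and a nil map can be read, but writing to a nil map panics until it is initialized. A minimal sketch, with a Config struct invented for the example:

```go
package main

import "fmt"

// Config is a made-up struct with one field of each kind.
type Config struct {
	Name    string
	Retries int
	Verbose bool
	Tags    []string
	Labels  map[string]string
}

func main() {
	var c Config // no initialization: every field holds its zero value

	fmt.Printf("%q %d %t\n", c.Name, c.Retries, c.Verbose)
	fmt.Println(c.Tags == nil, len(c.Tags), c.Labels == nil, len(c.Labels))

	c.Tags = append(c.Tags, "a") // appending to a nil slice works
	_ = c.Labels["missing"]      // reading a nil map works and yields ""

	// Writing to a nil map would panic, so initialize it first.
	c.Labels = map[string]string{}
	c.Labels["env"] = "prod"

	fmt.Println(c.Tags, c.Labels)
}
```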

Goroutines. Starting a concurrent task is just go doSomething(). Compared to the contortions I remember from doing concurrency elsewhere, this felt almost too easy. Channels took me a bit longer to get comfortable with, but the basic model is nice.
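For a sense of what that looks like beyond the one-liner, here is a minimal sketch that fans work out to goroutines and collects the results over a channel. squareAll is a toy function made up for the example; the WaitGroup and buffered channel are one common way to do this, not the only one.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll squares each input in its own goroutine and collects the
// results over a channel, keeping them in input order.
func squareAll(nums []int) []int {
	// Buffered so no goroutine blocks on send; each item is [index, value].
	results := make(chan [2]int, len(nums))
	var wg sync.WaitGroup

	for i, n := range nums {
		wg.Add(1)
		go func(i, n int) {
			defer wg.Done()
			results <- [2]int{i, n * n}
		}(i, n)
	}

	wg.Wait()      // every goroutine has sent its result
	close(results) // lets the range below terminate

	out := make([]int, len(nums))
	for r := range results {
		out[r[0]] = r[1]
	}
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4}))
}
```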

Fast compiles. The compiler is quick, so the feedback loop is tight. And it catches a lot of my mistakes before the code ever runs, which I did not have with PHP.

Small language. There is not much syntax to learn. I read through most of the language in a weekend. After Ruby, where there are five ways to do everything, having one obvious way is restful.

What annoyed me

The if err != nil thing. Yes, I just praised explicit errors. I also got tired of typing this:

result, err := doSomething()
if err != nil {
    return nil, err
}

over and over. The reasoning is good. The repetition still wears on you after the hundredth time.

Missing conveniences. Coming from Ruby I kept reaching for things that are not there. No map or select on a slice (you write the loop yourself). No nice one-liners for common collection operations. You write more code to do simple things. The code is clear, but it is more typing.

Where I landed

I do not love everything about Go, and I do not need to. What I have come to like is that the code is boring to read, in a good way. There are fewer clever tricks, so when I open a file I wrote three months ago I can actually follow it.

After years of languages that let me be clever, Go mostly stops me. I did not expect to appreciate that.