Visruth Srimath Kandali

No, AI is not Making Engineers 10x as Productive

1300 words | 7 min

Colton Voege

A very well written piece, I thought. It is nuanced and seems sufficient to explain away many of the patterns I’ve observed in the AI discourse.

Had my stubborn desire to enjoy coding set me up to be left behind?

I really do enjoy coding; I want to be a software engineer as well as a statistician, and that desire is motivated wholly by my interest in designing and building systems, programs–code, at some coarse level of abstraction. I do use LLMs, as I think I have made clear (though never for writing, for I have far too large an ego for that), but typically in two broad scenarios. One is vibe coding, where I care only about the (typically, easily verifiable) output and not about the latent representation (the program/theory (Naur, 1985)). This dovetails into scenario two, characterized by laziness, where I broadly know what to write but don’t want to check the documentation again or physically hit the keys to render my theory as code. Somewhere between the two is writing software in languages I do not know; this is more vibe code-y than lazy, but I think it may preclude, to some degree, my learning of new technologies, since I could just ask an LLM and skip the hard work that learning entails.

And it was… Fine. Despite claims that AI today is improving at a fever pitch, it felt largely the same as before.

I’ve felt this too. Models are good, don’t get me wrong–it is still magical to see R code which runs correctly on the first try appear at lightning pace–but they aren’t amazing. Let me briefly describe a recent encounter which reminded me of, and cemented, the mediocrity of models (with respect to writing R, at least).

What LLMs produce is often broken, hallucinated, or below codebase standards.

I started with this plot, which had too much going on in the legend. I needed to replace the legend text with just the skewness, and I needed to remove the grey background behind each line in the legend. I have Copilot Pro through the Education program, so I fired up the chat interface and used whichever model was leading at the time (this was a few weeks ago, so I hardly remember–let’s go with Sonnet 4), told it my issue, and provided my original pipeline.

It gave me this, but with the background still behind every line. I’m drawing that grey rectangle on the plot myself, so I figured all I needed was a show.legend = FALSE to make it go away, but I was curious whether the model would catch that. I must’ve asked it four or five times, giving it the whole pipeline and its output and telling it exactly what I needed removed. It gave the weirdest advice and missed the simple argument which exactly solved my problem. For reference, here is the working pipeline, that one argument included:

library(tidyverse)

read_csv("means.csv") |>
  filter(`Sample Size` <= 500) |>
  mutate(
    # Legend text: strip the brace-delimited parameters, append the skewness
    Distribution = paste(
      str_replace(Distribution, "\\{.*\\}", " "),
      round(Skewness, 2)
    ),
    # Order legend entries by descending skewness
    Distribution = fct(
      as.character(Distribution),
      levels = as.character(unique(Distribution[order(-Skewness)]))
    )
  ) |>
  arrange(Skewness) |>
  ggplot(aes(x = `Sample Size`, color = Distribution)) +
  # Grey band on the plot; show.legend = FALSE keeps it out of the legend keys
  geom_rect(
    aes(xmin = 0, xmax = Inf, ymin = 0.02, ymax = 0.03),
    fill = "grey",
    linewidth = 0,
    show.legend = FALSE
  ) +
  geom_hline(yintercept = 0.025, linetype = "dashed", linewidth = 1) +
  geom_line(aes(y = `Upper Tail`), linewidth = 1) +
  geom_line(aes(y = `Lower Tail`), linewidth = 1) +
  geom_vline(xintercept = 30, linetype = "dashed", linewidth = 0.75) +
  annotate(
    "text",
    x = 40,
    y = 0.04,
    label = "n = 30",
    hjust = 0,
    size = 5,
    color = "black"
  ) +
  labs(
    title = "Upper and Lower Tail Weights of Sampling Distributions",
    x = "Sample Size",
    y = "Tail Weight"
  ) +
  theme_bw() +
  theme(
    plot.title = element_text(size = 20),
    axis.title = element_text(size = 17),
    axis.text = element_text(size = 12)
  )
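The crux, for anyone hitting the same thing: in ggplot2, any layer that inherits a mapped aesthetic (here, color = Distribution) is drawn inside the legend keys by default, so the grey rectangle showed up behind every legend entry. A minimal, self-contained sketch of the symptom and the fix, on toy data rather than the means.csv above:

library(ggplot2)

# Toy stand-in for the real data
df <- data.frame(
  x = rep(1:10, 2),
  y = c(1:10, 10:1),
  g = rep(c("a", "b"), each = 10)
)

ggplot(df, aes(x, y, color = g)) +
  # This layer inherits color = g, so its grey rectangle is drawn inside
  # every legend key by default; show.legend = FALSE opts this one layer
  # out of the legend without touching the lines' entries.
  geom_rect(
    aes(xmin = 0, xmax = Inf, ymin = 4, ymax = 6),
    fill = "grey",
    linewidth = 0,
    show.legend = FALSE
  ) +
  geom_line(linewidth = 1)

Drop the show.legend = FALSE and each legend key picks up the grey background again, which is exactly what the model kept failing to diagnose.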

In my experience, AI delivers rare, short bursts of 10-100x productivity. When I have AI write me a custom ESLint rule in a few minutes, which would have taken hours of documentation surfing and tutorials otherwise, that’s a genuine order of magnitude time and effort improvement. Moments like this do happen with AI.

This is where I really resonated with Voege. One, it was nice to see a non-polarized take on AI, and two, it felt nice to feel seen. I don’t get those moments all the time, but they do happen, and it is always so cool to see a computer! solve a problem for you, all by itself. You literally just have to ask. However…

The problem is that productivity does not scale. I don’t write more than one ESLint rule per year. This burst of productivity was enabled solely by the fact that I didn’t care about this code and wasn’t going to work to make it readable for the next engineer.

And this is where I find it easy to slink into the role of statistician and eschew the quality of my code. I don’t think that is universally the wrong decision; most statisticians aren’t building software, they’re writing code, and that code is meant to run once or twice in a fragile build system. However, I will challenge the field to do better, and to begin with, I try to write software, not just scripts–or at least to get as close as I can without focusing too much on aesthetics and performance.

I wouldn’t be surprised to learn AI helps many engineers do certain tasks 20-50% faster, but the nature of software bottlenecks mean this doesn’t translate to a 20% productivity increase and certainly not a 10x increase.

I’ve found the most help in clearing small humps: the second kind of task I said I primarily use LLMs for, the one induced by laziness/inertia.

It’s okay to sacrifice some productivity to make work enjoyable. More than okay, it’s essential in our field. If you force yourself to work in a way you hate, you’re just going to burn out. Only so much of coding is writing code, the rest is solving problems, doing system design, reasoning about abstractions, and interfacing with other humans.

As a non-SWE (at least, not by training), I think this is a fantastic description of the field, and I wish someone had given me this description earlier, just so I had a better, more realistic understanding of what most people in the CS industry do.

Do not scold engineers for not using enough tokens.

Goodhart concurs.

I really liked this article on the whole. It’s more than worth a few reads. I only engaged with the IC parts and didn’t really address the sections aimed at managers and non-AI skeptics; I like those too, but I’m tired.

Bibliography

Naur, Peter (1985). Programming as theory building. Microprocessing and Microprogramming, 15(5), 253–261. doi: 10.1016/0165-6074(85)90032-8.


I also need to publish a small unscientific case study I did with R and LLMs, and writing this post has reminded me of that yet again; I need to update some of the analyses too.

I’m also trying to write one post a day. That is very ambitious, especially since I’m at a conference and will remain out of town for almost another week, but I think it is a good goal to strive toward. It was inspired in part by Inkhaven, which I want to apply to and will probably forget to, but I would at least like to make this daily public writing a habit.

#article
