When nobody searches for your science

After I had spent several months building the company's website, it had become clear that the keyword research tools on which the search-engine optimisation industry has built its authority were of essentially no use to us. Every variant of the precise scientific name for what the company does returned the same flat, indifferent figure: zero monthly searches. The first instruction in any guide to SEO (find the keywords that have volume and write content that targets them) had no purchase on a niche so specialised that even its own terminology was, by any conventional measurement, invisible.

This is not, in itself, a novel problem. There are categories of expertise (academic specialisms and regulatory subfields among them) that search engines have always served imperfectly, because the people who need that expertise describe what they want in terms quite unlike those used by the people who supply it. The orthodoxy of the SEO trade, refined over the past two decades into something close to dogma, is to ignore that gap: to find the closest term that does have volume, write for it, accept the dilution. For a company in a saturated category that approach can be made to work, after a fashion. For a company whose entire business resides in the gap, it cannot.

I checked, then, the obvious phrase: the scientific name for the molecular machinery that the company specialises in engineering. Zero. I checked the variants a non-specialist might use, and the vocabulary a pharma executive might stumble towards. Each one returned the same figure. The terms that do have volume, "genetic code expansion", "non-canonical amino acids", describe a category much broader than what the company actually does, and writing for them produces the kind of copy that two hundred biotech firms could publish without anyone noticing the difference. "We do peptide synthesis." "We offer custom solutions." The phrases survive on company websites because the alternative, which is to say something specific, would require an editorial position that most companies never settle on.

The shift, when it came, was a shift in object. I had been thinking about what the science is called. I started, instead, to think about how someone looking for it would describe it. A pharma business-development lead with a peptide drug candidate that requires unnatural amino acid incorporation does not, on the whole, type the name of the molecular machinery into Google. They type something approximate, "peptide synthesis beyond natural amino acids", or "modified peptide production", or "custom amino acid peptide services", none of which is quite the right phrase, all of which describe, with reasonable accuracy, someone who needs what we make. The pages of the site, after several rewrites, came to do two things at once: explain each capability in the language a pharma lead would actually use in a search bar, and connect that explanation to the underlying scientific mechanism, so that the chemists reviewing the page found nothing to which they could object. One register addressed accuracy; the other addressed search.

* * *

Even this, it turned out, was not enough. I had assumed, without thinking about it very carefully, that the work was to be done in the prose; that if the language was right, the engines would find it. They did not, or not reliably. Search engines, on encountering a website, read the copy and try to infer what the site is for. They do this competently for ordinary cases. For a niche so specific that even the terminology is unfamiliar, "competently" means "half right", a standard that, for a company hoping to be found by the small number of people in the world who specifically need what it offers, is no standard at all.

What fixes this, where the prose alone cannot, is structured data. The Schema.org vocabulary, launched in 2011 by Google, Bing and Yahoo (Yandex joined later that year) and most often embedded in a page as JSON-LD, is a parallel layer of declarations addressed not to the human reader but to the machines parsing the page. Each page on the site now carries it: Organization, Person, Service, FAQPage, JobPosting, each filling in what the prose, by virtue of being prose, cannot fully say. Where a search engine would otherwise have to guess that the company offers a particular kind of incorporation as a service, the Service schema declares it explicitly, with a description, a URL and a category. For a company with mainstream search demand, structured data is a refinement on top of a working surface. For a company whose core terminology has no measurable search volume, it does much more of the work; without it, the engines have very little to go on.
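To make the Service declaration concrete, here is a minimal sketch in Python of how such a block might be generated at build time. The property names (name, description, url, category, provider) are standard Schema.org properties of Service; the company name, URLs and wording are invented placeholders, and the helper functions are hypothetical rather than anything the site actually uses.

```python
import json


def service_jsonld(name, description, url, category, provider_name, provider_url):
    """Build a Schema.org Service declaration as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": name,
        "description": description,
        "url": url,
        "category": category,
        "provider": {
            "@type": "Organization",
            "name": provider_name,
            "url": provider_url,
        },
    }


def to_script_tag(data):
    """Serialise the dict into a JSON-LD script element for the page head."""
    # Escape "</" so that no string value can close the script element early.
    payload = json.dumps(data, indent=2, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values only: an assumption for illustration, not real site data.
example = service_jsonld(
    name="Non-canonical amino acid incorporation",
    description="Production of peptides containing amino acids beyond the natural twenty.",
    url="https://example.com/services/ncaa-incorporation",
    category="Peptide synthesis",
    provider_name="Example Biotech Ltd",
    provider_url="https://example.com",
)

print(to_script_tag(example))
```

The description field is where the searcher's approximate language and the chemist's precise language can sit side by side, stated once and without ambiguity.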

In a similar spirit, and rather more recently, I built an llms.txt file, the format proposed in 2024 by Jeremy Howard at Answer.AI, written so that the language models now mediating a growing share of search traffic can read a structured summary of a site without having to interpret its prose. Every capability, every modality, every key person, set down plainly. Whether the file directly affects discoverability is, at this stage, an open question; the format is too new for confident claims. The exercise itself, however, was useful. It forced me to write down what the company does, free of the marketing softness that creeps in unbidden whenever one is composing for human audiences.
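For readers who have not seen the format, the sketch below renders a file in the general shape the llms.txt proposal describes: a title as a top-level heading, a one-line summary as a blockquote, then sections of annotated links. It is a rough illustration in Python under stated assumptions: the function name, the company, the section headings and the URLs are all hypothetical, and nothing here reproduces the file the site actually serves.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt-style Markdown summary of a site.

    sections maps a heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # End with exactly one newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content only: an assumption for illustration, not real site data.
llms_txt = render_llms_txt(
    title="Example Biotech",
    summary="Engineers translation machinery to make peptides containing non-canonical amino acids.",
    sections={
        "Capabilities": [
            (
                "Non-canonical amino acid incorporation",
                "https://example.com/services/ncaa-incorporation",
                "what it is, why standard expression systems cannot do it, cost and scale",
            ),
        ],
        "People": [
            ("Team", "https://example.com/team", "key scientific staff and their roles"),
        ],
    },
)

print(llms_txt)
```

Writing the notes after each link is the useful part of the exercise: each one has to say, in a dozen words, what the page is actually for.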

There is a different mistake I might have made, and almost did. The early version of the content plan ran to forty pages, each targeted at some narrow variation of what the company offers, in obedience to the doctrine that says: more pages, more keyword targets, more surface area exposed to search. I scrapped it and built eight substantial pages instead. The page on non-canonical amino acid incorporation, for instance, does not aim at any one keyword. It explains what these molecules are, why standard expression systems cannot use them, how orthogonal translation machinery solves the problem, what it implies for peptide drug design, and where cost and scale currently sit. It has a better chance of surfacing for any of those queries because it actually covers the topic.

Search engines, for reasons that have only become more pronounced as their results have come to incorporate generative language models, reward depth over breadth. For niche biotech, this is a form of luck: you probably have neither the team nor the time to produce fifty pages of thin content, but you can, with some patience, produce a handful that genuinely constitute the best explanation of your science available online. In a field with almost no competition, that bar is reachable.

* * *

In the longer view, three things were sequencing problems. I retrofitted the structured data when it should have been part of the architecture from the first commit; auditing every page after the fact and working backwards from finished prose to JSON-LD took longer, by some margin, than building the schemas in from the start would have. I underestimated internal linking, treating it as cosmetic when it should have been part of the initial design alongside the content and the schemas; the connections between pages are part of the structure search engines use to understand a site. And I did not set out the story first. I built pages around capabilities before deciding what narrative tied them together, and ended up with a collection of accurate pieces that did not build on one another. Each page explained a technology; none of them moved a visitor closer to understanding why the company exists. I am backfilling that narrative now, threading the pages into something approaching a coherent argument. It works, though more slowly than getting the order right would have.

There are, as far as I know, around a dozen companies in the world working on genetic code expansion at commercial scale. The number of websites attempting to rank for this content is correspondingly small. In a competitive market, search visibility is a matter of grinding away for years against hundreds of established pages; in niche synthetic biology, a well-structured site with accurate content can become, within months, the reference for a topic. We are not there yet, and the site is young, but the trajectory is visible and the competition is sparse. The constraint was never the competition. It was finding someone who understood the science well enough to write about it accurately and the web well enough to make it findable. That combination is rare in biotech, which is precisely why the opportunity exists.

For niche biotech, in the end, discoverability is a translation problem: explaining the science in the language one's audience uses, and then making that explanation legible to every system that might surface it. The keyword tools will continue to say zero. The site can still be the place people find.

April 2026