Molecular Complexity in Drug Discovery: The Third Dimension

Whether it is a small-molecule drug or a comparative behemoth like a vaccine or antibody, the medicines of today are designed with a specific target in mind, be it a protein, receptor, DNA or virus. And whether these targets are composed of amino or nucleic acids, their fundamental similarity is one of unique structural complexity. One would therefore surmise that a complementary intricacy should be applied to our drugs in order for them to distinguish one protein from another, or one genetic sequence from another. Yet the reality of modern drug discovery is one of apprehension and suspicion towards introducing molecular complexity, with chemists instead opting for more two-dimensional, ‘flatter’ compounds which risk attrition from having sub-optimal drug-like properties. This blog entry will examine the arguments for 2D and 3D drugs in modern medicine, and how one may consider introducing complexity in a facile manner.

If you have ever been asked what separates a mere compound of interest from a drug candidate, you would likely answer with something akin to Lipinski’s Rule of 5 (RO5): molecular weight under 500 Da, LogP under 5, and no more than 5 hydrogen-bond (H-bond) donors and 10 acceptors. It’s a good answer for sure, but since Lipinski’s 1997 epiphany,[1] the success rate for ‘optimized’ lead compounds through the rigmarole of pre-clinical trials, Phase I, II and III clinical trials and the final approval of the New Drug Application by a willing regulatory body remains dire at less than 10%.[2] Not exactly efficient for a process that occupies the minds of intelligent, dedicated teams of medicinal chemists, synthesis specialists, protein biologists, biochemists, medical doctors and more for the better part of a decade, and costs upwards of $1 billion per registered drug.

So, what’s the problem? Certainly the RO5 are rooted in sound logic; high molecular weight, lipophilic compounds frequently suffer rapid metabolism, and the tiny fraction that escapes the gut and reaches the bloodstream is so insoluble that it precipitates, resulting in total clearance from the body with zero efficacious impact. H-bond donors and acceptors may theoretically form strong interactions with the protein of interest, but that will mean nothing if they also interact so readily with water molecules that the drug cannot penetrate the cell membranes necessary to access the bloodstream and reach the protein in the first place. So it’s not that we scientists have been looking in the wrong place for wonder drug utopia, or that Lipinski had it wrong. In fact, Lipinski specified that a lead compound should break no more than one of the four criteria that he had defined in order for a drug to have suitable oral bioavailability. Perhaps the real criticism of the RO5 is that they’re too simplistic. The definition of ‘drug-like’ has expanded in the intervening years; rotatable bond count, for example, takes note of a compound’s flexibility, which may prove vital for accessing deep-rooted binding sites in a target that more rigid analogues would struggle to reach.[3]
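
To make the four criteria concrete, here is a minimal Python sketch of an RO5 check. It assumes the descriptors have already been calculated elsewhere (nothing is derived from a structure here), it assumes the commonly quoted tolerance of a single violation (exposed as a parameter so it can be changed), and the two example compounds are hypothetical, not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one compound (values supplied by the user)."""
    mol_weight: float      # molecular weight in Da
    logp: float            # calculated octanol-water partition coefficient
    h_bond_donors: int     # number of H-bond donors
    h_bond_acceptors: int  # number of H-bond acceptors


def ro5_violations(d: Descriptors) -> list[str]:
    """Return the names of the Rule of 5 criteria that the compound breaks."""
    checks = {
        "molecular weight > 500 Da": d.mol_weight > 500,
        "LogP > 5": d.logp > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, broken in checks.items() if broken]


def passes_ro5(d: Descriptors, max_violations: int = 1) -> bool:
    """True if the compound breaks no more than max_violations criteria."""
    return len(ro5_violations(d)) <= max_violations


# Hypothetical compounds, for illustration only
greasy_hit = Descriptors(mol_weight=612.7, logp=6.3, h_bond_donors=2, h_bond_acceptors=8)
compact_lead = Descriptors(mol_weight=342.4, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)

for name, compound in [("greasy_hit", greasy_hit), ("compact_lead", compact_lead)]:
    print(name, ro5_violations(compound), passes_ro5(compound))
```

A filter like this is only a first pass; as the paragraph above argues, passing it says nothing about flexibility, shape or any of the later additions to the ‘drug-like’ lexicon.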

But even with this broadened lexicon and, in theory, understanding of what makes a lead a successful drug, success in drug discovery continues to elude the best of us. High-throughput screening (HTS), the parallel testing of libraries containing thousands to millions of compounds, so frequently relied upon by medicinal chemists for identifying the first hits in a drug discovery program, routinely returns sub-par or just abysmal starting points: toxicophores galore and rigid, densely aromatic lumps of grease with the aqueous solubility of sand. These may indeed be only starting points, but bad starting points can lead medicinal chemists down meandering paths of mediocrity culminating in attrition, wasted years of effort and a drinking problem.

Drug discovery is not simple, so why should we assume that the molecules we ultimately create to treat disease must be structurally simple? This may cause some readers to run for the hills, but molecular complexity is undeniably a crucial factor in determining clinical success.[4,5] This does not necessarily mean that the synthesis of complex structures should be difficult; a 20-step linear synthesis may well create a drug candidate, but remember that drug syntheses need to be scalable and financially viable. Diversity-oriented synthesis allows for the creation of diverse molecular chemotypes whose structural intricacy mirrors that of natural products, a class of compounds from which so many known drugs are derived. Increased three-dimensional character allows molecules to easily access multiple vectors of chemical space without substantial increases in molecular weight or lipophilicity. The flexibility of design and structure of complex molecules means that the products of such ingenuity are frequently unique, with little-to-no danger of infringing on a rival group’s patent space. Lastly, the pursuit of complex structures is how new chemistries are discovered: new reactions and reagents that make once-difficult reactions simpler, and transformations once thought to be impossible, possible (Table 1). With new in vivo targets being discovered and validated every year as potential therapeutic endeavours, the need to expand our collective synthetic chemistry know-how is inevitable, and diversity-oriented synthesis is one way to achieve that.

Table 1. Arguments for and against the use of diversity-oriented synthesis

However, despite such arguments, the trend in medicinal chemistry research across historical oral drugs, modern oral drugs and the lead compounds entering clinical development today shows, far too often, an over-reliance on reactions deemed safe or easy, such as amide couplings and sp2-sp2 organometallic couplings: transformations that yield conjugated 2-dimensional bricks. Privileged structures, including benzodiazepine-based GABA agonists, 2-arylindole and benzimidazole-based GPCR antagonists, purine-based antivirals, and dihydropyridine-based calcium channel blockers, all feature conjugated pharmacophores considered fundamental and unchangeable (Figure 1A).[6] The RO5 parameters have also increased in value across the board, flirting with and outright breaking the limits of what is considered ‘acceptable’. One study,[7] which analysed trends in molecular weight and lipophilicity for drugs registered pre-1983, drugs registered from 1983 to 2007 and compounds in various stages of development at four major pharmaceutical firms, found that across all patented compounds the median cLogP was 4.1 and the median molecular weight 450 Da. By contrast, the median cLogP for drugs registered since 1990 was 3.1 and the median molecular weight 432 Da. Studies concur that average molecular mass decreases through each stage of clinical development, while molecular complexity, measured as the fraction of sp3-hybridized carbons in the structure (Fsp3) or by the number of stereocentres present, conversely follows an upward trend (Figure 1B).[4] History shows us that ‘complexity is desirable’ and ‘small is beautiful’; is it any wonder that a reduction in the number of launched low-molecular weight oral drugs (< 350 Da) correlates so well with a reduced quantity of new drug launches over the same time period?[7]

Figure 1. A) Structures of a benzodiazepine (1), 2-arylindole (2), purine (3) and dihydropyridine (4) with the privileged structure coloured blue. B) Mean molecular weight and Fsp3 for compounds in various stages of development.
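
Since Fsp3 is simply the number of sp3-hybridized carbons divided by the total carbon count, it is easy to work out by hand. The short Python sketch below does just that for three textbook structures; the function name `fsp3` is my own, and the carbon counts are typed in manually rather than derived from a structure file, so treat it as an illustration of the definition rather than a cheminformatics tool.

```python
def fsp3(sp3_carbons: int, total_carbons: int) -> float:
    """Fraction of sp3-hybridized carbons: sp3 carbon count / total carbon count."""
    if total_carbons <= 0:
        raise ValueError("the compound must contain at least one carbon")
    if not 0 <= sp3_carbons <= total_carbons:
        raise ValueError("sp3 carbon count must lie between 0 and the total carbon count")
    return sp3_carbons / total_carbons


# (sp3 carbons, total carbons), counted by hand from the textbook structures
examples = {
    "benzene": (0, 6),      # fully aromatic, completely flat
    "toluene": (1, 7),      # one methyl carbon on an aromatic ring
    "cyclohexane": (6, 6),  # fully saturated
}

for name, (sp3, total) in examples.items():
    print(f"{name}: Fsp3 = {fsp3(sp3, total):.2f}")
```

The two extremes, benzene at 0 and cyclohexane at 1, bracket the range; most drug candidates sit somewhere in between.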


However, the discovery and validation of new targets for drug discovery means that recent medicinal chemistry efforts do not occupy the same chemical space as historical drugs.[7] Medicinal chemists rightly rely on the prevailing strategies and innovations from the past to aid their discovery of hits and leads where no such compound exists, and HTS by definition evaluates the binding of already-known compounds. Modern drug targets are also frequently less ‘druggable’ than the commonly explored targets of the past, such as kinases. The medicinal chemists of today focus on inhibiting protein-protein interactions, HIV proteases and GPCRs, for which larger, more lipophilic hit compounds usually emerge. It’s no secret that higher lipophilicity correlates with increased binding affinity, hence the reluctance to minimize this attribute, in the hope that a lump of grease with complementary bioavailability may one day be discovered. The introduction of stereocentres receives similar pushback whenever anyone dares to suggest it. Assuming that the lead molecule demonstrates enantioselective binding, the synthetic route should be altered so as to produce this enantiomer selectively over the other(s). If a racemate is produced from which only one enantiomer is needed, the overall yield suddenly drops by at least half. The circumvention of this problem through use of chiral starting materials, or catalysts that facilitate formation of one stereoisomer with high enantiomeric excess (ee), usually means a sudden and uncomfortable rise in the costs of production. Even when a chiral lead compound has been optimized sufficiently, regulatory bodies will usually demand rigorous assessment of both enantiomers anyway to confirm that in vivo efficacy is the result of only one isomer, and that the candidate does not racemise, so this means two rounds of pre-clinical and clinical evaluation. If funding is not forthcoming, only the lottery or re-mortgaging the house can save you here.
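
The yield penalty of a racemate is simple arithmetic, and a few lines of Python make it explicit. This is a sketch under stated assumptions: ee is taken in its usual sense of the excess of the major enantiomer as a percentage of the total, the wanted enantiomer is assumed to be the major one, separation is assumed to be lossless, and the 80% chemical yield is a made-up number for illustration.

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """ee in percent from the amounts (any consistent unit) of the two enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return abs(major - minor) / total * 100.0


def desired_enantiomer_yield(chemical_yield: float, ee_percent: float) -> float:
    """Yield (%) of the wanted enantiomer, assuming it is the major one
    and that it can be separated from the minor one without loss."""
    if not 0.0 <= ee_percent <= 100.0:
        raise ValueError("ee must lie between 0 and 100 percent")
    fraction_desired = (1.0 + ee_percent / 100.0) / 2.0
    return chemical_yield * fraction_desired


# Hypothetical route with an 80% chemical yield
racemic = desired_enantiomer_yield(80.0, ee_percent=0.0)
selective = desired_enantiomer_yield(80.0, ee_percent=enantiomeric_excess(99.0, 1.0))

print(f"racemic route:   {racemic:.1f}% of the wanted enantiomer")
print(f"selective route: {selective:.1f}% of the wanted enantiomer")
```

In practice the racemic figure is an upper bound, since resolution itself costs material, which is why the text says the yield drops by at least half.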

Given these valid concerns, the introduction of complexity to a lead molecule should ideally be accomplished in as facile a manner as possible. One of the simplest means of introducing increased sp3 character to a compound is by replacing the sp- and sp2-hybridized substituents around a common scaffold with saturated alternatives, even if that scaffold remains aromatic. Hirata et al. demonstrated the effective optimisation of a lead compound designed to inhibit Retinoic Acid-Related Orphan Receptor γ (RORγ) by structural modification influenced primarily by Fsp3 and ligand efficiency (LE), ultimately increasing binding affinity 50-fold, with no time-dependent inhibition of any CYP450s (Figure 2A).[8] Even aromatic scaffolds can be saturated, and those courageous enough to do so sometimes discover a new chemical entity (NCE) with superior drug-like qualities. Collier et al. have described how replacement of a central benzothiazole with a more saturated thiazolopiperidine scaffold resulted in an equipotent phosphoinositide 3-kinase γ (PI3Kγ) inhibitor with lower lipophilicity, a 10-fold increase in aqueous solubility, and one of the highest isozyme selectivities yet observed (Figure 2B).[9]

Figure 2. A) Structural optimization of the substituents in a RORγ inhibitor with a focus on Fsp3. B) Scaffold hopping to produce a superior PI3Kγ inhibitor with higher Fsp3 and lower cLogP.
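
For readers unfamiliar with ligand efficiency, it is conventionally defined as the binding free energy per heavy (non-hydrogen) atom, with the free energy obtained from ΔG = RT ln(Kd). The Python sketch below applies that standard definition. The affinities and heavy-atom counts are hypothetical and are not the values reported in ref. [8]; note also that substituting an IC50 for Kd, as is often done, is itself an approximation.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant: dG = RT ln(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def ligand_efficiency(kd_molar: float, heavy_atoms: int, temperature_k: float = 298.15) -> float:
    """Ligand efficiency in kcal/mol per heavy atom: LE = -dG / N_heavy."""
    if heavy_atoms <= 0:
        raise ValueError("heavy atom count must be positive")
    return -binding_free_energy(kd_molar, temperature_k) / heavy_atoms


# Hypothetical lead: 1 uM affinity, 30 heavy atoms
lead_kd, lead_atoms = 1e-6, 30
# Hypothetical optimized analogue: 50-fold tighter binding, two extra heavy atoms
opt_kd, opt_atoms = lead_kd / 50, 32

gain = binding_free_energy(lead_kd) - binding_free_energy(opt_kd)
print(f"free energy gained by a 50-fold affinity increase: {gain:.2f} kcal/mol")
print(f"LE of lead:     {ligand_efficiency(lead_kd, lead_atoms):.3f} kcal/mol per heavy atom")
print(f"LE of analogue: {ligand_efficiency(opt_kd, opt_atoms):.3f} kcal/mol per heavy atom")
```

The point of tracking LE alongside Fsp3 is that a potency gain only counts if it is not simply bought with extra atoms; in this made-up case, LE rises because the affinity improves faster than the molecule grows.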


Returning to the concept of ‘privileged’ structures, many saturated fragments have been identified that hold both rigidity and metabolic stability, such as cyclohexane, as seen in the CCR5 inhibitor Maraviroc (Figure 3A). Between 2012 and 2018, the FDA approved 18 drugs containing cyclopropane.[10] Stepan et al. have also published the use of a bicyclo[1.1.1]pentane scaffold to replace a central 1,4-disubstituted benzene ring, explaining that the increased 3-dimensionality yielded an equipotent γ-secretase inhibitor with reduced ElogD, higher aqueous solubility and Cmax observed in mice, and which crucially directed the substituents along the same vectors as the benzene ring (Figure 3B).[11] This research has been expanded by Levterov et al., who recently highlighted the 2-oxabicyclo[2.1.1]hexane motif as a water-soluble, 3-dimensional alternative to 1,3-disubstituted benzene rings (Figure 3C).[12] Spirocyclics should be considered as second-generation privileged structures, as they usually demonstrate lower lipophilicity than their monocyclic counterparts with the same number of atoms, and consequently are featured in several approved drugs, such as Rolapitant (Figure 3D).

Figure 3. A) Maraviroc. B) Replacement of a 1,4-disubstituted benzene ring with a bicyclo[1.1.1]pentane scaffold. C) Replacement of a 1,3-disubstituted benzene ring with a 2-oxabicyclo[2.1.1]hexane scaffold. D) Rolapitant.


Several well-known transformations are fundamental for any medicinal chemist seeking expanded 3-dimensionality. Pericyclic reactions, including cycloadditions, cheletropic reactions and sigmatropic rearrangements, can rapidly form diverse, complex cyclic structures stereospecifically.[13] Organometallic couplings have also diversified to allow efficient sp2-sp3 C-C bond formation.[14] Alkenes can be used as starting blocks for stereoselective dihydroxylation and epoxidation using the osmium- and titanium-based catalysts, respectively, popularized by Sharpless.[15,16] Smith et al. have developed a synthesis of spiroazetidines and spiropyrrolidines in a wide array of ring sizes and substitution patterns,[17] and this synthetic repertoire has been broadened by Sveiczer et al., who reported new methodologies for the synthesis of sp3-rich carbocyclic and heterocyclic spirocycles, including eight novel scaffolds.[18] Hiesinger et al. have summarized the synthetic routes to a wide array of spirocycles, comparing their impact on target potency and selectivity to non-spirocyclic analogues.[19]

In conclusion, molecular complexity and diversity-oriented synthesis represent two concepts with the potential to transform the currently dreadful success rate of modern drug discovery. Lead compounds with greater 3-dimensionality, as measured by the fraction of sp3-hybridized carbons and the number of chiral centres present, stand a higher chance of successfully passing pre-clinical and clinical trials to become registered drugs. The over-reliance by so many medicinal chemistry teams on flat, heavily aromatic, conjugated systems has been met with a reduction in the number of clinical candidates receiving regulatory approval, due to poor selectivity and high toxicity, disappointing in vivo efficacy, or substandard drug-like properties, such as solubility. Diversity-oriented synthesis enables the creation of architecturally unique and beautiful structures with superior drug-like properties. The means by which complexity can be introduced to a lead compound are varied and have been extensively explored, with the high-yielding syntheses of diverse spirocycles and other second-generation privileged structures now common knowledge. Combined with HTS, the facile introduction of molecular complexity to early hit compounds should allow for the rapid development of optimized leads and increased registration of new drugs.


  • Author:

    Dr. Andrew Shouksmith
    Senior Scientist I

  • References

    [1] Lipinski, C. et al., Adv. Drug Delivery Rev., 1997, 23, 3-25
    [2] Hingorani, A.D. et al., Sci. Rep., 2019, 9, 18911
    [3] Veber, D.F. et al., J. Med. Chem., 2002, 45, 2615-2623
    [4] Lovering, F. et al., J. Med. Chem., 2009, 52, 6752-6756
    [5] Lovering, F., Med. Chem. Commun., 2013, 4, 515-519
    [6] DeSimone, R.W. et al., Comb. Chem. High Throughput Screen., 2004, 7, 473-493
    [7] Leeson, P.D. et al., Nature Rev. Drug Discov., 2007, 6, 881-890
    [8] Hirata, K. et al., ACS Med. Chem. Lett., 2015, 7, 23-27
    [9] Collier, P.N. et al., J. Med. Chem., 2015, 58, 5684-5688
    [10] Wei, W. et al., Drug Discov. Today, 2020, 25, 1839-1845
    [11] Stepan, A.F. et al., J. Med. Chem., 2012, 55, 3414-3424
    [12] Levterov, V.V. et al., Angew. Chem. Int. Ed., 2020, 59, 7161-7167
    [13] Greer, E.M. et al., Annu. Rep. Prog. Chem., Sect. B: Org. Chem., 2012, 108, 251-271
    [14] Manolikakes, G., Comprehensive Organic Synthesis II, Elsevier Ltd., 2014, 3, 392-464
    [15] Kolb, H.C. et al., Chem. Rev., 1994, 94, 2483-2547
    [16] Katsuki, T., Org. React., 1996, 48, 1-299
    [17] Smith, A.C. et al., J. Org. Chem., 2016, 81, 3509-3519
    [18] Sveiczer, A. et al., Org. Lett., 2019, 21, 4600-4604
    [19] Hiesinger, K. et al., J. Med. Chem., 2021, 64, 150-183