Computational QSAR study of novel 2-aminothiazol-4(5H)-one derivatives as 11β‐HSD1 inhibitors

QSAR analysis was used to study the structure–activity relationship of fifty-six 2-aminothiazol-4(5H)-one derivatives as 11β‐HSD1 inhibitors. 3D representations of each structure were created with GaussView 6 and optimized at the DFT B3LYP/6-311++G(d,p) level in the Gaussian 16 software. Examples of the resulting geometry-optimized three-dimensional structures are presented in Fig. 2.

Fig. 2

Geometrically optimized structures of selected 2-aminothiazol-4(5H)-one analogs: (a) 1c; (b) 2e; (c) 3g; (d) 4h; (e) 5i; (f) 6j

During the molecular modeling study, over 5000 molecular descriptors were calculated for the geometrically optimized structures with the Dragon software. To improve the predictive performance of the QSAR models, the 10 best descriptors were selected from this large pool of candidates. The selected descriptors belong to the following descriptor blocks: Galvez topological charge indices, 2D autocorrelations, 2D matrix-based descriptors, GETAWAY descriptors, Burden eigenvalues, 3D-MoRSE descriptors, and RDF descriptors (Table 1 and Table 3). These descriptors were subsequently employed as independent variables to build multiple models using artificial neural networks as a post-processing algorithm and, in turn, to forecast the 11β‐HSD1 inhibitory activity of the studied compounds and of 39 newly designed compounds.

Table 1 Values of molecular descriptors selected for model building for the investigated 2-aminothiazol-4(5H)-one derivatives

The Automated Network Search (ANS) algorithm was used to train approximately five hundred neural networks with varying characteristics, covering different learning algorithms and activation functions. After evaluating the training, test, and validation performances, including the correlation coefficient and the sum of squares error, a multilayer perceptron (MLP) 10-11-1 network with the ID 344 was chosen as the best model for representing the relationship between the selected descriptors and 11β‐HSD1 inhibitory activity, that is, for the regression problem under study. The designation '10-11-1' refers to the network architecture: ten neurons in the input layer (one per descriptor), eleven neurons in the hidden layer, and a single output neuron representing the predicted activity. The network was trained with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm and uses exponential activation functions in both the hidden and output layers.
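The 10-11-1 architecture with exponential activations can be illustrated by a minimal forward pass. This is only a sketch: the weights below are random placeholders (the trained weights of network 344 are not reported here), the input vector is hypothetical, and the function name `mlp_forward` is introduced for illustration.

```python
import math
import random

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer perceptron with exponential
    activations in both the hidden and the output layer."""
    hidden = [
        math.exp(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return math.exp(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

N_IN, N_HID = 10, 11          # ten descriptors in, eleven hidden neurons
rng = random.Random(0)        # fixed seed so the sketch is reproducible

# Placeholder weights (assumption): small random values, NOT the trained model.
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_IN)] for _ in range(N_HID)]
b_hidden = [rng.uniform(-0.1, 0.1) for _ in range(N_HID)]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(N_HID)]
b_out = 0.0

# Hypothetical scaled descriptor values for a single compound.
x = [rng.random() for _ in range(N_IN)]
y_pred = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)

# Adjustable parameters: hidden weights + hidden biases + output weights + output bias.
n_params = N_HID * N_IN + N_HID + N_HID + 1
print(n_params, y_pred)
```

In a real workflow these weights would be fitted by a quasi-Newton optimizer such as BFGS; the sketch only shows how ten inputs map to one predicted value.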

The predictive QSAR MLP 10-11-1 model is characterized by a high correlation between the experimental logarithm of the percentage of 11β‐HSD1 inhibition (log I11β‐HSD1) and the value predicted by the network. The correlation coefficients are 0.99 for the training set, 0.95 for the test set, and 0.93 for the validation set. The sum of squares error between observed and predicted values was 0.002 for the training, 0.03 for the test, and 0.06 for the validation sample. The validation procedure was performed as described by Roy et al. [24] and confirmed that the MLP 10-11-1 model may be used to predict the 11β‐HSD1 inhibitory activity of novel compounds based on a pseudothiohydantoin scaffold. The quality metrics most commonly used to assess the reliability and precision of QSAR model predictions [25] are presented in Table 2.
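The two performance measures quoted above can be computed as in the following sketch. The observed and predicted values are made-up illustrative numbers, not data from this study, and the error is taken as the plain sum of squared residuals; some packages scale it differently, so the exact definition used by the modeling software is an assumption here.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_o * ss_p)

def sum_of_squares_error(observed, predicted):
    """Unscaled sum of squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

# Hypothetical log I values for four compounds (illustration only).
observed = [1.20, 1.50, 1.70, 1.90]
predicted = [1.25, 1.45, 1.72, 1.88]

r = pearson_r(observed, predicted)
sse = sum_of_squares_error(observed, predicted)
print(round(r, 3), round(sse, 4))
```

Computing both measures separately for the training, test, and validation subsets gives the three pairs of values reported for a network.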

Table 2 Selected validation parameters of the predictive MLP 10-11-1 model

A graphical comparison of the predicted and experimentally determined values of log I11β‐HSD1 confirms that the network predicts the data relatively well (Fig. 3). The scatter plot shows a strong positive relationship.

Fig. 3

A scatter plot of experimental versus predicted log I11β‐HSD1 values. Correlation coefficient values: training set R = 0.99; test set R = 0.95; and validation set R = 0.93

To ensure accurate predictions and prevent overfitting, ten key descriptors were first selected and used as input variables for the ANN model (Table 3). Moreover, the model was validated on separate test and validation sets. The high Q2 value (0.9944) confirms the generalizability of the model. Four of these ten descriptors were 3D geometrical descriptors, while the remaining six were 2D topological descriptors. This indicates that both two- and three-dimensional molecular structure play a crucial role in the biological activity of the investigated compounds. The regression analysis using artificial neural network algorithms and the subsequent sensitivity analysis showed the significance of specific descriptors for the 11β‐HSD1 inhibitory activity of the investigated 2-aminothiazol-4(5H)-one derivatives. This analysis identified the molecular features that have the most significant impact on the activity of the compounds being studied. The ranking of descriptors used in the 10-11-1 MLP model, listing them in order of importance along with their dimensionality, block, and definition, is presented in Table 3. Figure 4 illustrates the relative importance of each molecular descriptor in the MLP 10-11-1 model, based on its contribution to error reduction in the sensitivity analysis. Each variable entering the model had an error quotient exceeding 1, indicating that all preselected descriptors were significant for the MLP 10-11-1 network. A variable would have been excluded had it shown no effect on the network's performance or worsened it.
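One common way to obtain such an error quotient is to replace a single input by its mean value, recompute the model error, and divide by the error of the intact model. The sketch below follows that scheme as an assumption about how the sensitivity analysis works; the toy model, data, and function names are hypothetical and stand in for the trained network.

```python
def sse(model, X, y):
    """Sum of squared errors of `model` on rows X with targets y."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y))

def error_quotients(model, X, y):
    """For each input j: error with column j replaced by its mean,
    divided by the baseline error. A quotient above 1 means the model
    gets worse without that input; about 1 means the input is not used."""
    baseline = sse(model, X, y)
    quotients = []
    for j in range(len(X[0])):
        mean_j = sum(row[j] for row in X) / len(X)
        X_masked = [row[:j] + [mean_j] + row[j + 1:] for row in X]
        quotients.append(sse(model, X_masked, y) / baseline)
    return quotients

# Toy stand-in for a trained network: it uses input 0 and ignores input 1.
def toy_model(row):
    return 2.0 * row[0]

X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [0.1, 1.9, 4.1, 5.9]      # hypothetical targets with a small residual

q = error_quotients(toy_model, X, y)
print(q)
```

Sorting the inputs by decreasing quotient gives a ranking of the kind shown for the ten descriptors.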

Table 3 Sensitivity analysis results for the MLP 10-11-1 network

Fig. 4

Sensitivity analysis: feature importance of the top 10 descriptors

A series of previous studies on the search for selective 11β-hydroxysteroid dehydrogenase type 1 inhibitors indicates that the presence of large hydrophobic groups (adamantyl and cyclopentyl) in the 5-position of the thiazole ring positively affects the 11β-HSD1 inhibitory activity of compounds containing a pseudothiohydantoin scaffold [18,19,20,21,22,23]. In particular, the adamantyl group is known for positively modulating the therapeutic index of numerous experimental compounds, thereby enhancing the drug-like characteristics of a lead compound without increasing its toxicity. After carefully considering the above observations and the QSAR model built in the present study, 39 new potential inhibitors of 11β-hydroxysteroid dehydrogenase type 1 were designed. The introduction of bulky hydrophobic groups, such as cyclohexyl and tetrahydropyran-methyl moieties, was rationalized by the significant contribution of 3D descriptors like RDF030u and GETAWAY indices (Table 3), which suggest favorable steric and shape complementarity with the enzyme’s binding site. These inhibitors are based on a pseudothiohydantoin scaffold bearing cyclohexyl, 2-(tetrahydrofuran-2-yl)methyl, 2-(tetrahydro-2H-pyran-2-yl)methyl, or cyclopropyl residues at the amino group, and they differ in the substituents at C-5 of the thiazole ring, which are listed in Table 4.

Table 4 Designed 2-aminothiazol-4(5H)-one derivatives with 11β-HSD1 inhibitory activity predicted by the MLP 10-11-1 model

The obtained predictions show that 2-aminothiazol-4(5H)-one derivatives bearing cyclohexyl and 2-(tetrahydro-2H-pyran-2-yl)methyl moieties at the amino group may yield the most promising 11β-hydroxysteroid dehydrogenase type 1 inhibitors. Among all designed analogs, the most active compounds are predicted to be 7d, 7e, 7f, 7i, 7j, 8f, 8i, 9f, 9h, 9i, and 9j, which at a concentration of 10 μM may inhibit the activity of isoform 1 by more than 80%.

The pseudothiohydantoin scaffold is essential and effective in medicinal chemical research and drug design. Numerous compounds with a 2-aminothiazol-4(5H)-one core have been synthesized with different substituents. Additionally, various therapeutic properties, including 11β-hydroxysteroid dehydrogenase type 1 inhibitory activity, have been described in scientific literature as mentioned above.

The present investigation has produced a validated QSAR model based on artificial neural networks. This model addresses the regression problem relating molecular structure to the 11β‐HSD1 inhibitory activity of 2-aminothiazol-4(5H)-one derivatives. It has facilitated the design of new compounds and was used to predict their inhibitory activity towards 11β‐HSD1 (Table 4). The MLP 10-11-1 model includes ten of the most informative descriptors selected from a large dataset (Table 3 and Table 6) to prevent overtraining and to identify the most critical molecular properties that affect the 11β‐HSD1 inhibitory activity of the pseudothiohydantoin-based compounds being studied. Based on the preselection process, descriptors weighted by atomic properties such as intrinsic state, atomic mass, and polarizability, together with four unweighted descriptors, significantly impacted the inhibitory activity towards 11β-hydroxysteroid dehydrogenase type 1. These indices contain 2D and 3D information from the molecular structure, suggesting that the model has a reasonable level of predictability because the modeled activity is influenced by more than a single molecular dimension.

Descriptors belonging to seven different classes (GALVEZ, 2D autocorrelations, 2D matrix-based descriptors, GETAWAY descriptors, Burden eigenvalues, 3D-MoRSE descriptors, and RDF descriptors) contributed to the modeling of 11β‐HSD1 inhibitory activity of the studied analogs. The first 3 descriptors, according to the sensitivity analysis conducted (Table 3), contribute most to the QSAR model and describe the two-dimensional structure of the molecules analyzed. The top-ranked descriptor GGI2 relates to electronic charge distribution, a key factor in enzyme binding, while MATS1s incorporates electronic and structural properties. These findings align with previous reports emphasizing the role of electrostatics and molecular topology in 11β-HSD1 inhibition [27,28,29].

GGI2 is the topological charge index of order 2 and is ranked first in the sensitivity analysis. It is part of the Galvez topological charge indices derived from the first ten eigenvalues of the corrected adjacency matrix of compounds [30, 31]. The GGIn descriptors are linked to the topological charge index of order n, where ‘n’ indicates the order of eigenvalue. The GGI2 descriptor assesses molecule charge transfer calculated from the adjacency topological matrix.
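A widely cited formulation of the Galvez topological charge index builds it from the adjacency matrix A and the reciprocal squared topological distance matrix D*: with M = A·D*, the charge term between atoms i and j is m_ij − m_ji, and the index of order k sums the absolute charge terms over atom pairs at topological distance k. The sketch below follows that formulation as an assumption; it is not taken from the Dragon source, and the function names and the butane test graph are illustrative.

```python
from collections import deque

def distance_matrix(adj):
    """Topological distances (bond counts) by breadth-first search.
    Assumes a connected, hydrogen-suppressed molecular graph."""
    n = len(adj)
    dist = [[0] * n for _ in range(n)]
    for start in range(n):
        seen = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for v, d in seen.items():
            dist[start][v] = d
    return dist

def topological_charge_index(adj, order):
    """Sum of |m_ij - m_ji| over atom pairs at topological distance `order`,
    where M = A x D* and D* holds reciprocal squared distances."""
    n = len(adj)
    dist = distance_matrix(adj)
    recip_sq = [[0.0 if i == j else 1.0 / dist[i][j] ** 2 for j in range(n)]
                for i in range(n)]
    m = [[sum(adj[i][k] * recip_sq[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return sum(abs(m[i][j] - m[j][i])
               for i in range(n) for j in range(i + 1, n)
               if dist[i][j] == order)

# Carbon skeleton of n-butane: a four-atom chain 0-1-2-3.
butane = [[0, 1, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [0, 0, 1, 0]]

g1 = topological_charge_index(butane, 1)   # 0.5
g2 = topological_charge_index(butane, 2)   # 2/9, about 0.2222
print(g1, g2)
```

In this formulation the order-2 index is nonzero only when atoms two bonds apart sit in different topological environments, which is why it responds to branching and chain ends.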

The MATS1s descriptor in this model falls under the 2D autocorrelations category of Dragon descriptors [27]. It is a Moran autocorrelation computed on the topological structure. Its calculation involves summing autocorrelation terms over structural fragments of a given length, so that different fragment lengths yield different autocorrelation vectors [28]. The intrinsic state, which reflects a physicochemical property, is used as the weighting component. MATS1s therefore relates the topology of the structure, or of its parts, to the intrinsic state of the molecules being studied.
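The Moran autocorrelation at a given lag can be sketched as the mean product of centered atomic weights over all atom pairs separated by that number of bonds, divided by the variance of the weights. The code below assumes this standard form; the distance matrix and the weight values are hypothetical placeholders, not intrinsic states computed for any compound in this study.

```python
def moran_autocorrelation(dist, weights, lag):
    """Moran autocorrelation of atomic `weights` at topological distance `lag`.
    `dist` is the topological distance matrix of the molecular graph."""
    n = len(weights)
    mean_w = sum(weights) / n
    dev = [w - mean_w for w in weights]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)
             if dist[i][j] == lag]
    if not pairs:
        return 0.0                      # no atom pairs at this lag
    numerator = sum(dev[i] * dev[j] for i, j in pairs) / len(pairs)
    denominator = sum(d * d for d in dev) / n
    return numerator / denominator

# Four-atom chain 0-1-2-3 with its topological distance matrix.
dist = [[0, 1, 2, 3],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [3, 2, 1, 0]]

# Hypothetical atomic weights standing in for intrinsic states.
weights = [1.0, 2.0, 2.0, 1.0]

mats1 = moran_autocorrelation(dist, weights, lag=1)   # -1/3
print(mats1)
```

A negative value at lag 1, as in this toy case, means that directly bonded atoms tend to have weights on opposite sides of the molecular mean.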

The third-ranked descriptor, SpPosA_B(m), is derived from the Burden matrix weighted by atomic mass; the Burden matrix is itself built from the adjacency matrix [32]. The adjacency matrix, also known as the vertex adjacency matrix, is a crucial tool for calculating molecular descriptors [33]. It is one of the essential graph-theoretical matrices and represents the entire set of connections between adjacent pairs of atoms. The adjacency matrix provides branching information, which may also be relevant for modeling pharmacological activity.

The fourth- and fifth-ranked indices in terms of relevance to the developed model belong to the GETAWAY (Geometry, Topology, and Atom-Weights AssemblY) descriptors. The R3v+ and R2s descriptors encode 3D molecular geometry, atom relatedness, and chemical information. These descriptors are computed using spatial autocorrelation formulas, connectivity indices, and traditional matrix operations [34]. They are weighted by atomic van der Waals volume and intrinsic state, respectively. Past findings have indicated that GETAWAY descriptors can capture both local and distributed information regarding the molecular structure, making them a valuable asset in developing reliable models [35]. It has been observed that using GETAWAY descriptors together with descriptors containing information on the entire molecular structure appears to give more accurate models when the property being modeled depends specifically on the 3D aspects of the molecule, as is the case for biological activities. However, while all the molecular descriptor sets effectively model the octanol–water partition coefficient, GETAWAY and topological descriptors produce the most reliable regressions, as in the present study [36].

Descriptors VE1_D/Dt and VE2_B(p), ranked sixth and last, respectively, belong to the class of 2D matrix-based descriptors, as does the already mentioned descriptor SpPosA_B(m). VE1_D/Dt denotes the sum of the coefficients of the last eigenvector of the distance/detour matrix, which is built from the distance and detour matrices. It is calculated as the length from the shortest to the longest path between vertices from the branching or cyclicity vertices and reflects substructure information about the cation sections [29]. VE2_B(p) is based on the sum of the coefficients of the last eigenvector of the Burden matrix weighted by polarizability. The value of VE2_B(p) increases with the number of graph vertices [37].

The seventh significant molecular descriptor SpMin8_Bh(m) is categorized under Burden eigenvalues. This index is computed using molecular graphs containing hydrogen atoms through the Burden matrix [38]. Generally, each matrix in the Dragon calculation has the first eight largest positive eigenvalues denoted as SpMaxk_Bh(w) and the first eight smallest negative eigenvalues denoted as SpMink_Bh(w). In this context, k represents the eigenvalue rank and w indicates the atomic property, such as atom polarizability (p), intrinsic state (s), and atomic mass (m) as seen in the case of SpMin8_Bh(m). This particular molecular descriptor accounts for molecular mass and can be utilized to address the challenge of identifying molecular structural similarity or diversity of studied compounds.

Mor05u is another significant index that is a part of MoRSE descriptors, which stand for Molecular Representation of Structures based on Electron diffraction. It represents the unweighted signal 05. 3D-MoRSE descriptors offer insights into compounds based on their 3D structure through a molecular transform derived from an equation employed in electron diffraction studies. By considering multiple atomic properties, these descriptors can effectively encode highly adaptable representations of a compound and mirror the three-dimensional arrangement of atoms within a molecule [29].
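The electron-diffraction-style transform behind 3D-MoRSE descriptors is usually written as a sum over atom pairs of w_i·w_j·sin(s·r_ij)/(s·r_ij), where r_ij is the interatomic distance and s the scattering parameter. The sketch below assumes this form with unit weights for the unweighted case, and assumes that signal 05 corresponds to s = 4 Å⁻¹ (signal 01 being s = 0); the coordinates are an arbitrary three-atom example, not a structure from this study.

```python
import math

def morse_signal(coords, s, weights=None):
    """3D-MoRSE-type signal: sum over atom pairs of
    w_i * w_j * sin(s * r_ij) / (s * r_ij). Unit weights if none are given."""
    n = len(coords)
    if weights is None:
        weights = [1.0] * n
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            x = s * math.dist(coords[i], coords[j])
            # sin(x)/x tends to 1 as x -> 0, so treat x == 0 explicitly.
            total += weights[i] * weights[j] * (1.0 if x == 0 else math.sin(x) / x)
    return total

# Arbitrary 3D coordinates in angstroms (illustration only).
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]

mor01u = morse_signal(coords, s=0.0)   # equals the number of atom pairs, 3.0
mor05u = morse_signal(coords, s=4.0)   # assumed mapping: signal 05 <-> s = 4
print(mor01u, mor05u)
```

Because each term depends only on interatomic distances, the signal is invariant to rotation and translation of the molecule.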

Lastly, RDF030u (Radial Distribution Function 3.0/unweighted) is a descriptor within the Radial Distribution Function (RDF) category. These indices gauge the likelihood of an atom existing within a spherical volume with a radius of R. They are determined using radial basis functions centered on various interatomic distances from 0.5 to 15.5 Å. RDF descriptors’ calculations consider the atoms’ quantity, properties, and the distances between them. The RDF030u descriptor is influenced by the molecular structure within a 3.0 Å atomic radius, relative to the mass center. Thus, it details the likelihood of atoms being distributed within a spherical area with the abovementioned radius. This may be a crucial consideration when analyzing the chemical properties of a compound [39, 40].
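An RDF descriptor of this kind is commonly written as a sum over atom pairs of w_i·w_j·exp(−β·(R − r_ij)²), so that pairs whose distance r_ij lies close to R contribute most. The sketch below assumes this form with unit weights for the unweighted case, R = 3.0 Å for RDF030u, and a smoothing parameter β = 100 Å⁻²; the value of β and the coordinates are assumptions made for illustration.

```python
import math

def rdf_descriptor(coords, radius, beta=100.0, weights=None):
    """Radial distribution function value at `radius` (angstroms):
    sum over atom pairs of w_i * w_j * exp(-beta * (radius - r_ij)**2)."""
    n = len(coords)
    if weights is None:
        weights = [1.0] * n
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(coords[i], coords[j])
            total += weights[i] * weights[j] * math.exp(-beta * (radius - r) ** 2)
    return total

# Three atoms: one pair exactly 3.0 angstroms apart, a third atom far away.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 10.0)]

rdf030u = rdf_descriptor(coords, radius=3.0)   # only the 3.0 A pair contributes
print(rdf030u)
```

Evaluating the same function on a grid of radii gives the full set of RDF descriptors for one weighting scheme.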

A limitation of this study is the lack of experimental validation of the predicted inhibitory activities. In addition, the dataset used is relatively small and limited to a congeneric series, which may restrict model generalizability to broader chemical spaces. The absence of external validation or prospective predictions limits our ability to assess how the model performs on unseen chemotypes. Moreover, reliance on in silico descriptors without integrating pharmacokinetic profiling reduces the translational potential of the proposed candidates. These limitations highlight the importance of future experimental work and validation using structurally diverse analogs. Future work will involve synthesis and biological evaluation of the most promising series of predicted compounds, i.e., pseudothiohydantoin compounds with cyclohexyl and 2-(tetrahydro-2H-pyran-2-yl)methyl residues substituted at the amino group.
