Volume 1, Issue 11 p. 484-490
Full Paper
Open Access

Rapid and Mild One-Flow Synthetic Approach to Unsymmetrical Sulfamides Guided by Bayesian Optimization

Naoto Sugisawa

Department of Basic Medicinal Sciences, Graduate School of Pharmaceutical Sciences, Nagoya University, Furo-cho Chikusa-ku, Nagoya, 464-8601 Japan

Dr. Hiroki Sugisawa

Department of Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kakuma-machi, Kanazawa, Ishikawa, 920-1192 Japan

Dr. Yuma Otake

School of Life Science and Technology, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, 226-8503 Japan

Prof. Dr. Roman V. Krems

Department of Chemistry, University of British Columbia, Vancouver, British Columbia, V6T 1Z1 Canada

Stewart Blusson Quantum Matter Institute, University of British Columbia, Vancouver, British Columbia, V6T 1Z4 Canada

Prof. Dr. Hiroyuki Nakamura

Laboratory for Chemistry and Life Science, Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho Midori-ku, Yokohama, 226-8503 Japan

Prof. Dr. Shinichiro Fuse

Corresponding Author

Department of Basic Medicinal Sciences, Graduate School of Pharmaceutical Sciences, Nagoya University, Furo-cho Chikusa-ku, Nagoya, 464-8601 Japan

First published: 18 August 2021
Citations: 13

Graphical Abstract

The first rapid and mild (5.1 s, 20 °C) one-flow synthesis of unsymmetrical sulfamides is achieved. The reaction conditions producing ≥75 % yield are identified by a machine learning approach using Bayesian optimization (BO). It is demonstrated that BO produces desired conditions with <20 experiments and provides relationships between reaction parameters and outputs.

Abstract

Bayesian optimization (BO) is regarded as an efficient approach that can identify optimal conditions using a restricted number of experiments. Despite demonstrated potential of BO, applications of BO-based approaches in synthetic organic chemistry remain limited. Herein, we achieved the first rapid and mild (5.1 s, 20 °C) one-flow synthesis of unsymmetrical sulfamides from inexpensive sulfuryl chloride. Undesired reactions were successfully suppressed and the risk in handling sulfuryl chloride was minimized by the use of micro-flow technology. The reaction conditions producing ≥75 % yield were identified by a machine learning approach based on BO. It was demonstrated that BO produced the desired reaction conditions with a small number of experiments (19 and 10 experiments) in the entire search space (10,500 combinations of reaction conditions). Gaussian process (GP) models produced by BO provided the relationships between combinations of reaction parameters and outputs (RCRPO).

1 Introduction

Optimization of reaction conditions is usually an inevitable process in synthetic organic chemistry. A typical optimization approach includes variation of individual reaction parameters while keeping the other parameters fixed (one variable at a time, OVAT).1 The OVAT approaches do not provide sufficient relationships between combinations of reaction parameters and reaction outputs (RCRPO). As such, they can fail to identify optimal conditions.1b Design of experiment (DOE) approaches may vary multiple parameters simultaneously and provide meaningful RCRPO.1 However, DOE approaches generally require a large number of experiments. Bayesian optimization (BO) is regarded as an efficient approach that can identify optimal reaction conditions using a restricted number of experiments.2 BO trains Bayesian machine learning surrogate models in an iterative approach aiming to enhance the predictive power of the surrogates. BO learns covariances between multiple parameters, thus yielding models capable of predicting RCRPO.3 Recently, BO-based reaction optimizations using sophisticated automated systems and a somewhat large number of experiments (hundreds to thousands) were reported.2e Lapkin, Bourne and coworkers reported BO of multiple objectives using a small number of data points (<100),2b, 2c uncovering relative contributions of parameters.2b-2d The RCRPO can be used for gaining valuable insights into reactions that can lead to scientific discoveries and design of efficient chemical processes. Despite this demonstrated potential of BO, applications of BO in synthetic organic chemistry remain limited.2 The present work applies BO to identify optimal reaction conditions for a new class of reactions with a very small number of experiments.

Unsymmetrical sulfamides F derived from different primary amines A and D are important as drugs and their candidates.4 Conventional acidic synthetic approaches from sulfuryl chloride (B) usually require multiple steps, high temperature and long reaction time conditions (Scheme 1a) and are not applicable to the synthesis of sulfamides containing acid-labile groups.5

Scheme 1. Previously reported synthetic approaches a)–b), and the approach developed in this study c) for unsymmetrical sulfamide F.

Although many alternative approaches that used E containing various leaving groups (LG=catechol,6 imidazole,7 oxazolidinone,8 fluorine and imidazole9) were developed to overcome these problems, they also require multiple steps,6, 7, 8, 9 high temperature,6, 7, 8, 9 long reaction time,7, 9 and expensive reagents9 (Scheme 1b).10 Sequential introduction of A and D against B under basic conditions is an ideal approach; however, it suffers from side reactions such as the formation of symmetric sulfamide G5b, 11 and the base-mediated conversion of sulfamoyl chloride C to sulfonylimine H, followed by dimerization and/or polymerization.12 Thus, this approach has been applicable only to specific substrates.13 Micro-flow technology14 allows precise control of both short reaction times (<1 s), enabled by rapid mixing (milliseconds), and reaction temperature.15 Herein, we report the first rapid and mild one-flow synthesis of F via sequential introduction of A and D against B (Scheme 1c) under reaction conditions identified by BO.

2 Results and Discussion

Four reaction parameters were used for simultaneous optimization (Figure 1), yielding 10,500 (5×6×50×7) combinations. The scope of the examined parameters and the number of points for each parameter were selected based on our experience in developing micro-flow processes using highly electrophilic species such as triphosgene.16 Even a slight difference in reaction time can influence the results in micro-flow processes using highly active species.15 Therefore, a relatively large number of time points (50) was selected. The scikit-learn machine learning Python library was used for this study. The BO scheme employed uses Gaussian process (GP) models as surrogates and includes:

Figure 1. The parameters in the reaction for obtaining 5a. The reactor was connected to a back pressure regulator when W was 30 °C or 40 °C. Bn=benzyl.

  1. Initialization by five experiments with combinations of reaction conditions randomly defined using Latin hypercube sampling (LHS).

  2. Training of GP model by the results of the five experiments.

  3. Identifying subsequent reaction conditions by maximizing the upper confidence bound (UCB) acquisition function determined by the GP model from step [2].

  4. Performing new experiments under conditions thus identified.

  5. Repeating steps [2] through [4] until yield exceeds 75 %.

The target yield (75 %) was set based on previous reports (average yield 68 %, 24 examples).10e, 10f
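The loop described in steps 1 to 5 can be sketched in a few lines with scikit-learn. This is a minimal illustration, not the authors' code: `toy_yield` is a hypothetical smooth function standing in for a real flow experiment, every parameter is rescaled to [0, 1] (only the level counts 5×6×50×7 are taken from the text), and the Latin hypercube sample is snapped to the grid by a simple helper written for this example. The UCB here uses scikit-learn's predictive standard deviation.

```python
import itertools

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

# Level counts per parameter (X, Y, Z, W) as stated in the text: 5 x 6 x 50 x 7.
GRID_SIZES = (5, 6, 50, 7)
# Assumption: every parameter is rescaled to [0, 1]; the real levels are not used here.
LEVELS = [np.linspace(0.0, 1.0, n) for n in GRID_SIZES]
GRID = np.array(list(itertools.product(*LEVELS)))  # shape (10500, 4)


def toy_yield(x):
    """Hypothetical smooth yield surface standing in for a real flow experiment."""
    centre = np.array([0.5, 0.7, 0.0, 0.6])
    return float(80.0 * np.exp(-np.sum((x - centre) ** 2) / 0.5))


def lhs_grid_rows(n_samples, rng):
    """Step 1: Latin hypercube sample snapped to the grid; returns row indices of GRID."""
    columns = []
    for n_levels in GRID_SIZES:
        # One random point inside each of n_samples equal strata, in shuffled order.
        u = (rng.permutation(n_samples) + rng.random(n_samples)) / n_samples
        columns.append(np.minimum((u * n_levels).astype(int), n_levels - 1))
    return np.ravel_multi_index(tuple(columns), GRID_SIZES)


def bayesian_optimize(target=75.0, n_init=5, kappa=0.1, max_experiments=40, seed=0):
    rng = np.random.default_rng(seed)
    tried = [int(i) for i in lhs_grid_rows(n_init, rng)]
    yields = [toy_yield(GRID[i]) for i in tried]
    kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(len(GRID_SIZES)))
    while max(yields) < target and len(tried) < max_experiments:
        # Step 2: retrain the GP surrogate on all results so far.
        gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-3, normalize_y=True)
        gp.fit(GRID[tried], np.array(yields))
        # Step 3: maximize the UCB acquisition over untested grid points.
        mean, std = gp.predict(GRID, return_std=True)
        ucb = mean + kappa * std
        ucb[tried] = -np.inf
        nxt = int(np.argmax(ucb))
        # Step 4: "run" the proposed experiment and record the result.
        tried.append(nxt)
        yields.append(toy_yield(GRID[nxt]))
    return tried, yields  # Step 5 is the loop condition above


if __name__ == "__main__":
    tried, yields = bayesian_optimize()
    print(f"{len(tried)} experiments, best toy yield {max(yields):.1f} %")
```

Masking already-tested grid points before taking the argmax is a design choice of this sketch; it prevents the loop from proposing the same conditions twice. The `max_experiments` cap is likewise an assumption added so the loop always terminates.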

Figure 1 summarizes the implementation of the reaction conditions for obtaining unsymmetrical sulfamide 5a from sulfuryl chloride (1), benzylamine (2a), and isopropylamine (3a).17 V- and T-shape micromixers were connected to Teflon® tubing and immersed in a water bath. Then, 1 (X M, 1.0 equiv) in CH2Cl2 and a solution of 2a (1.0 equiv) and base (pKa of conjugate acid=Y, 1.0 equiv) in CH2Cl2 were independently injected into the V-shape mixer and reacted at W °C for Z s. The resultant mixture and 3a (5.0 equiv) in CH2Cl2 were injected into the T-shape mixer. The resultant mixture was poured into a test tube. After a simple aqueous workup, the yield of 5a was determined via 1H NMR analysis.

We carried out the optimization twice (Table 1), each time with a different set of five initial experiments generated by LHS. The subsequent 14 and 5 BO-guided experiments identified the same final conditions {X=0.3 M, Y=8.9 (Me2NBn), Z=0.1 s, W=20 °C (Figure 2c)}, which produced a 77 % yield. Although the identified conditions were not proven to be optimal, the observed yield exceeded the target yield, and the optimization process was therefore stopped. The second optimization required only 10 experiments in total. This resulted from a better initial guess of the parameters by LHS (the conditions of N=4 already contained three of the finally identified parameter values). This procedure allowed us to identify the desired reaction conditions by sampling only 0.18 % (N=19) and 0.095 % (N=10) of the entire search space (10,500 combinations). This is attributable to a relatively simple and continuous landscape of the objective function for this reaction. To estimate the number of experiments required for BO of the reaction conditions studied here, 10,000 sets of five initial experiments were produced by LHS, and their yields were predicted by a regression model constructed from the acquired experimental data shown in Table 1, Figure 2 and Figure 3. BO was then simulated 10,000 times, and the average number of required experiments was determined to be 25.8 (Figure 2d; we also examined hyperparameters in these simulations. For details, see Supporting Information).
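The sampled fractions quoted above follow directly from the size of the search space. The short check below reproduces the arithmetic; the assignment of the level counts to X, Y, Z and W is inferred from the order of the columns in Table 1 and the stated 50 time points, and the function name is illustrative.

```python
from math import prod

# Level counts for each parameter; the mapping to X, Y, Z, W is inferred from the text.
GRID_SIZES = {"X": 5, "Y": 6, "Z": 50, "W": 7}
total = prod(GRID_SIZES.values())  # 10,500 combinations


def sampled_fraction(n_experiments, n_total=total):
    """Percentage of the search space covered by n_experiments."""
    return 100.0 * n_experiments / n_total


for n in (19, 10):
    print(f"N={n}: {sampled_fraction(n):.3f} % of {total} combinations")
```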

Table 1. BO-based optimization of reaction conditions for one-flow synthesis of 5a.

N     X [M]   Y (pKa of conjugate acid)   Z [s]   W [°C]   Yield[a] [%]

the first optimization (N=19)
1     0.2     7.7                         0.5     30       49
2     0.1     5.2                         1.9     20       24
3     0.4     8.9                         2.8     10       54
4     0.5     7.7                         3.6     0        33
5     0.3     7.0                         4.8     −20      44
6     0.4     8.9                         0.1     40       70
7     0.4     8.9                         5.0     40       42
8     0.4     8.9                         0.1     −20      55
9     0.1     8.9                         0.1     40       52
10    0.4     7.0                         0.1     40       52
11    0.4     8.9                         0.2     40       68
12    0.4     8.9                         0.1     30       74
13    0.4     8.9                         0.1     20       69
14    0.4     8.9                         0.2     30       70
15    0.5     8.9                         0.1     30       66
16    0.3     8.9                         0.1     30       74
17    0.3     10.0                        0.1     30       73
18    0.4     10.0                        0.1     30       71
19    0.3     8.9                         0.1     20       77

the second optimization (N=10)
1     0.2     8.9                         3.7     −20      44
2     0.4     5.2                         3.0     40       43
3     0.5     7.0                         4.7     10       45
4     0.3     8.9                         0.6     20       67
5     0.2     7.7                         1.7     0        44
6     0.3     5.2                         5.0     −20      33
7     0.3     8.9                         5.0     −20      38
8     0.3     8.9                         5.0     20       52
9     0.3     8.9                         0.4     20       70
10    0.3     8.9                         0.1     20       77

  • [a] Yields were determined via 1H NMR analysis using 1,1,2-trichloroethane as an internal standard.

Figure 2. Transition of yields during a) the first optimization and b) the second optimization. c) Identified conditions. d) Number of experiments required for the optimization, as predicted by simulation (10,000 runs). Std.=standard deviation.

Figure 3. The relationships between each reaction parameter and yields. Yields were determined via 1H NMR analysis using 1,1,2-trichloroethane as an internal standard.

The relationships between each reaction parameter and the reaction yields are useful for obtaining insights into the reactions. To roughly estimate correlation patterns, the yields corresponding to each parameter were predicted by GP regression models trained with different N (Figure 2a: N=19 and Figure 2b: N=10). The predicted correlations were compared with the experimentally obtained correlations (we performed 10 additional experiments; for details, see Supporting Information). In the cases of concentration and basicity, the predicted trends agree well with the experimental results for N=19, whereas the agreement is poor for N=10. In the cases of reaction time and temperature, the agreement is good for both N=19 and N=10. The relationships between the concentrations (X), basicities (Y), or temperatures (W) and yields show bell shapes (Figure 3a, 3b, and 3d). This illustrates that, for small values of X, Y and W, conversion of the substrate is insufficient, which leads to lower yields. At the other end of the distribution, when the values of X, Y and W are large, side reactions occur that also diminish yields. The yield decreases monotonically as the reaction time (Z) increases (Figure 3c). This is probably due to the instability of the intermediate 4a.12a, 12c

Figure 4 examines the effect of the interplay between concentration and temperature on the yield, as predicted by the GP models trained on each optimization in Table 1. A positive correlation between concentration and temperature is observed in Figure 4a (N=19): higher yields are predicted both for the combination of lower concentration and lower temperature and for the combination of higher concentration and higher temperature. We speculated that this positive correlation indicates the importance of mixing efficiency in this reaction. Reportedly, the viscosity of CH2Cl2 increases as the temperature decreases.18 Therefore, it is conceivable that the viscosity of the reaction mixture is greater at lower temperatures, thus requiring a lower concentration to decrease the viscosity. On the other hand, the viscosity of the reaction mixture should be lower at higher temperatures, allowing for higher concentrations under optimal conditions. The latter combination allowed us to achieve higher productivity. We note that the contour map predicted with N=10 is not sufficiently accurate owing to insufficient training information.

Figure 4. Contour maps between combinations of concentration and reaction temperature and yields that were predicted by the GP regression model trained by 19 experiments a) and 10 experiments b).

To verify the importance of the micro-flow conditions, comparative batch experiments were carried out (three independent runs), as shown in Table 2. The quantities of compounds and solvents and the temperature were identical to those of the flow conditions. Under the micro-flow conditions, the reaction time was 0.1 s+5.0 s. Under the batch conditions, however, it was impossible to operate the reaction in such a short time; thus, the reaction time was extended to 10 s+10 s. Although the reaction mixture was vigorously stirred (1,000 rpm) during the experiment, the yields decreased by ca. 20 %, probably due to insufficient control of reaction time and temperature (Table 2). These results corroborate the importance of mixing efficiency for this reaction suggested by the results in Figure 4a. Unexpectedly, a substantial amount of undesired symmetric sulfamide 6b was generated under the batch conditions.

Table 2. Comparison between micro-flow and batch conditions for synthesis of 5f.

        yield [%][a]
        micro-flow conditions           batch conditions (1000 rpm)
run     5f            6a     6b         5f     6a     6b
1       68 (59[b])    9      7          49     7      24
2       71            8      9          48     7      25
3       67            8      7          47     7      25

  • [a] Yields were determined via 1H NMR analysis using 1,1,2-trichloroethane as an internal standard. [b] Isolated yield.
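The "ca. 20 %" drop quoted in the text can be checked by averaging the three runs in Table 2. The snippet below is a simple arithmetic illustration using only the NMR yields listed in the table; the variable names are chosen for this example.

```python
from statistics import mean

# NMR yields [%] from Table 2 (runs 1-3).
flow = {"5f": [68, 71, 67], "6a": [9, 8, 8], "6b": [7, 9, 7]}
batch = {"5f": [49, 48, 47], "6a": [7, 7, 7], "6b": [24, 25, 25]}

for compound in ("5f", "6a", "6b"):
    f, b = mean(flow[compound]), mean(batch[compound])
    # Positive difference means the batch value is higher than the flow value.
    print(f"{compound}: flow {f:.1f} %, batch {b:.1f} %, difference {b - f:+.1f} points")

drop_5f = mean(flow["5f"]) - mean(batch["5f"])  # loss of desired product in batch
gain_6b = mean(batch["6b"]) - mean(flow["6b"])  # extra symmetric sulfamide in batch
```

On these numbers the average yield of 5f falls by roughly 21 percentage points in batch, while the average yield of 6b rises by about 17 points.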

A plausible reaction mechanism is shown in Scheme 2. The first coupling reaction between sulfuryl chloride (I) and the first amine II affords sulfamoyl chloride III. The subsequent coupling between III and the second amine IV affords the desired unsymmetrical sulfamide V. Nucleophilic addition of II onto III leads to the formation of undesired symmetric sulfamide VI under both flow and batch conditions (Table 2). We believe that deprotonation of the highly acidic sulfamoyl chloride III affords sulfamoyl anion VII, and that subsequent elimination of chloride affords sulfonyl imine VIII. Reportedly, the formation of VIII from III occurs rapidly in the presence of an amine base even at −78 °C.12a It is known that the sulfonyl imine VIII spontaneously undergoes dimerization12c and/or polymerization.12a It is conceivable that these undesired reactions decreased the yield of V. In addition, we speculated that nucleophilic substitution of the anion VII against sulfuryl chloride also occurred to generate IX.12b Aminolysis of IX with the second amine affords the desired unsymmetrical sulfamide V and undesired symmetric sulfamide X.12b The insufficient control of the residence time of III and of the reaction temperature under batch conditions causes undesired reactions via the anion VII that lead to a decrease in yields (Scheme 2). It should be noted that batch conditions are not suitable for this reaction owing to the rapid evolution of dangerous gases such as HCl. In contrast, the reaction can be performed safely under micro-flow conditions.

Scheme 2. Plausible reaction mechanism of sulfamide formation.

The substrate scope was examined using the optimized reaction conditions (Scheme 3).19 The sequential introduction of various primary amines, including an amino acid, afforded the desired products 5a–5l in acceptable to good yields (one-flow, 2 steps, 45–71 %). It should be noted that 5m and 5n, which contain two acid-labile groups (Boc and allyl, or dimethylacetal and allyl) and cannot be synthesized by conventional acidic approaches,5 were obtained in good yields (52 % and 51 %). Although the observed yields of our developed approach were comparable to the overall yields of previously reported approaches, the reaction time was significantly shorter. The use of expensive reagents was not necessary. Neither high nor low temperature conditions were required.

Scheme 3. One-flow synthesis of unsymmetrical sulfamides 5. Boc=tert-butoxycarbonyl. Isolated yields are shown.

3 Conclusion

In conclusion, the first rapid and mild (5.1 s, 20 °C) one-flow synthesis of unsymmetrical sulfamides was achieved. Undesired reactions were successfully suppressed and the risk in handling sulfuryl chloride was minimized by the use of micro-flow technology. Neither high nor low temperature conditions were required. In addition, the use of expensive reagents was not necessary. BO enabled the identification of targeted reaction conditions based on a small number of experiments (<20). We have shown that BO not only yields the optimal reaction conditions but also produces GP models that provide accurate multi-variate relationships between the reaction parameters and the reaction yield. These relationships can be used for gaining valuable insights into reactions that can lead to scientific discoveries and design of efficient chemical processes. We have demonstrated several of such insights for the title reaction, including the critical role of mixing, and consequently viscosity as determined by temperature and concentration.

Experimental Section

Details of Bayesian Optimization of Reaction Conditions

Gaussian Process Model

We use Gaussian Processes (GP) as surrogate models for Bayesian optimization (BO). The models are trained by $n$ input-output pairs,
$$\{(\mathbf{x}_i, y_i) \mid i = 1, \dots, n\}, \tag{1}$$
with the inputs
$$\mathbf{x}_i = [x_{i1}, \dots, x_{iD}]^{\mathsf{T}} \tag{2}$$
describing $D$ reaction conditions and the outputs $y_i$ – the reaction yields, as described in the main text. This assumes that the yield $y$ is regarded to be a function of reaction conditions $\mathbf{x}$,
$$y = f(\mathbf{x}) + \epsilon, \tag{3}$$
where $\epsilon$ is Gaussian noise with zero mean and variance $\alpha$. We model $f(\mathbf{x})$ by a GP with the radial basis function (RBF) kernel,
$$k_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) = \theta_0 \exp\left(-\sum_{k=1}^{D} \frac{(x_{ik} - x_{jk})^2}{\theta_k}\right), \tag{4}$$

where θ are the model hyperparameters determined by maximizing the logarithm of the marginal likelihood. We use models with fixed noise variance set to α=0.001 to ensure that it does not change as Bayesian optimization progresses. A non-zero value of noise variance is equivalent to model regularization. As the purpose of GP models is simply to determine the acquisition functions for BO, we prefer to use GP models with fixed α throughout BO. The Supporting Information provides more information on the suitability of our choice of the value of α. The yield data were standardized to zero mean and variance of 1 before GP models were applied. The noise variance was set to α=0.00444 for N=19 and 0.00504 for N=10 for prediction of the relationships between each reaction parameter and the reaction yields and RCRPO. The variance was determined based on the standard deviation of our experiment (ca. 1 %). For details, see Supporting Information. Our source code for Bayesian optimization was constructed with the scikit-learn machine learning Python library (available at https://github.com/HirokiSugisawa/bo-flow).
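A GP of this form can be set up in scikit-learn roughly as follows. This is a hedged sketch rather than the published code: it assumes a `ConstantKernel` for θ0 multiplied by an anisotropic `RBF`, a fixed `alpha=0.001`, and `normalize_y=True` for the standardization of yields. Note that scikit-learn's RBF is parameterized as exp(−d²/(2ℓ²)), so θk in Eq. (4) would correspond to 2ℓk². The training rows are experiments 1 to 5 of the first optimization in Table 1; using the raw, unscaled parameter values as inputs is an assumption of this example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

# Experiments 1-5 of the first optimization (Table 1): X [M], Y (pKa), Z [s], W [deg C].
X_train = np.array([
    [0.2, 7.7, 0.5, 30.0],
    [0.1, 5.2, 1.9, 20.0],
    [0.4, 8.9, 2.8, 10.0],
    [0.5, 7.7, 3.6, 0.0],
    [0.3, 7.0, 4.8, -20.0],
])
y_train = np.array([49.0, 24.0, 54.0, 33.0, 44.0])  # NMR yields [%]


def build_gp(n_dims, alpha=0.001):
    """GP with theta0 * RBF (one length scale per parameter) and fixed noise variance."""
    kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(n_dims))
    return GaussianProcessRegressor(
        kernel=kernel,
        alpha=alpha,             # fixed, not optimized during fitting
        normalize_y=True,        # standardize yields before fitting
        n_restarts_optimizer=5,  # assumption: a few restarts of the likelihood search
        random_state=0,
    )


gp = build_gp(n_dims=4).fit(X_train, y_train)
query = np.array([[0.3, 8.9, 0.1, 20.0]])  # the conditions finally identified in Table 1
mean, std = gp.predict(query, return_std=True)
print("fitted kernel:", gp.kernel_)
print("predicted yield and std:", mean, std)
```

With only five training points the prediction is expected to be rough; the point of the sketch is the model configuration, not the numerical output.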

Bayesian Optimization

The goal of BO is to optimize the reaction conditions $\mathbf{x}$ so that the yield $y$ is maximized. Here, the yield as a function of any reaction conditions can be described using the GP model as,
$$\hat{y}_n = \mathbf{k}_+^{\mathsf{T}} (\mathbf{K} + \alpha \mathbf{I})^{-1} \mathbf{y}, \tag{5}$$
where $\mathbf{K}$ is the $n \times n$ covariance matrix with the matrix elements $k_{ij}$ given by the kernel function with the optimal parameters, and $\mathbf{I}$ is the identity matrix. The corresponding variance of the posterior GP is
$$\hat{\sigma}_n = k_{++} - \mathbf{k}_+^{\mathsf{T}} (\mathbf{K} + \alpha \mathbf{I})^{-1} \mathbf{k}_+, \tag{6}$$
where the hat over the symbol denotes the predicted value,
$$\mathbf{k}_+ = [k(\mathbf{x}, \mathbf{x}_1), \dots, k(\mathbf{x}, \mathbf{x}_n)]^{\mathsf{T}}, \tag{7}$$
$$k_{++} = k(\mathbf{x}, \mathbf{x}), \tag{8}$$
and
$$\mathbf{y} = [y_1, \dots, y_n]^{\mathsf{T}}. \tag{9}$$
The predicted variance $\hat{\sigma}$ represents the confidence of the predicted yields $\hat{y}$ (the confidence is lower when $\hat{\sigma}$ is larger). We use the upper confidence bound (UCB) approach that uses the following function
$$a(\mathbf{x}) = \hat{y}_n(\mathbf{x}) + \kappa \hat{\sigma}_n(\mathbf{x}) \tag{10}$$
to determine the next set of reaction conditions as
$$\mathbf{x}_{\mathrm{next}} = \operatorname*{argmax}_{\mathbf{x}} a(\mathbf{x}), \tag{11}$$

where κ is an arbitrary parameter balancing exploration and exploitation. In the present work, κ was set to 0.1 (the choice of this value is justified in the Supporting Information). At each iteration, the reaction yield at $\mathbf{x}_{\mathrm{next}}$ is added to the training set, $n$ is reset to $n + 1$, and the procedure is repeated until the desired yield is achieved.
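For readers who prefer to see Eqs. (4) to (11) as explicit linear algebra, the following NumPy sketch evaluates the posterior mean, the posterior variance and the UCB over a set of candidate conditions. It is an illustration under stated assumptions: the hyperparameters `theta0` and `theta` are passed in by hand instead of being fitted, the yields are assumed to be standardized already, and the demo data are toy values rather than measurements. The acquisition follows Eq. (10) literally, using the variance of Eq. (6); many implementations use its square root instead.

```python
import numpy as np


def rbf_kernel(A, B, theta0, theta):
    """Eq. (4): k = theta0 * exp(-sum_k (a_k - b_k)^2 / theta_k) for all row pairs."""
    diff = A[:, None, :] - B[None, :, :]
    return theta0 * np.exp(-np.sum(diff ** 2 / theta, axis=-1))


def gp_posterior(X_train, y_train, X_query, theta0, theta, alpha):
    """Posterior mean (Eq. 5) and variance (Eq. 6) at each row of X_query."""
    A = rbf_kernel(X_train, X_train, theta0, theta) + alpha * np.eye(len(X_train))
    k_plus = rbf_kernel(X_train, X_query, theta0, theta)  # Eq. (7), one column per query
    mean = k_plus.T @ np.linalg.solve(A, y_train)
    k_pp = theta0  # Eq. (8): k(x, x) = theta0 for this kernel
    var = k_pp - np.sum(k_plus * np.linalg.solve(A, k_plus), axis=0)
    return mean, np.clip(var, 0.0, None)  # clip tiny negative round-off


def next_conditions(X_train, y_train, candidates, theta0, theta, alpha=0.001, kappa=0.1):
    """Eqs. (10)-(11): return the candidate row that maximizes the UCB."""
    mean, var = gp_posterior(X_train, y_train, candidates, theta0, theta, alpha)
    ucb = mean + kappa * var  # sigma-hat is the posterior variance, as defined in Eq. (6)
    return candidates[int(np.argmax(ucb))]


# Toy one-dimensional data (illustrative only, not measured yields).
X_demo = np.array([[0.0], [1.0]])
y_demo = np.array([0.0, 1.0])
grid_demo = np.linspace(0.0, 2.0, 21).reshape(-1, 1)
print(next_conditions(X_demo, y_demo, grid_demo, theta0=1.0, theta=np.array([1.0])))
```

Solving the linear system with `np.linalg.solve` instead of forming the explicit inverse is a numerical convenience of this sketch and does not change the result of Eqs. (5) and (6).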

General Procedure for One-Flow Synthesis of Unsymmetrical Sulfamide 5

A solution of sulfuryl chloride (1) (0.300 M, 1.00 equiv) in CH2Cl2 (flow rate: 2.40 mL/min) and a solution of R1NH2 2 (0.180 M, 1.00 equiv) and Me2NBn (0.180 M, 1.00 equiv) in CH2Cl2 (flow rate: 4.00 mL/min) were injected into the V-shape mixer at 20 °C with syringe pumps. The resultant mixture was passed through reaction tube 1 (inner diameter: 0.500 mm, length: 54.3 mm, volume: 10.7 μL, reaction time: 0.1 s) at the same temperature. The resultant mixture and a solution of R2R3NH 3 (0.900 M, 5.00 equiv) in CH2Cl2 (flow rate: 4.00 mL/min) were injected into the T-shape mixer at 20 °C with syringe pumps. The resultant mixture was passed through reaction tube 2 (inner diameter: 0.800 mm, length: 1725 mm, volume: 867 μL, reaction time: 5.0 s) at the same temperature. After being eluted for ca. 20 s to reach a steady state, the resultant mixture was collected in a test tube for 25 s at room temperature. The reaction mixture was washed with 1 M aqueous HCl and water, dried over MgSO4, filtered, and concentrated in vacuo at room temperature. The residue was purified by silica gel column chromatography, preparative TLC, or GPC.
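The tube volumes, residence times and equivalents quoted in this procedure are mutually consistent, as the following check shows. It is a simple worked calculation using only the dimensions, concentrations and flow rates given above; the helper names are illustrative, and the tubes are assumed to be ideal cylinders with plug flow.

```python
import math


def tube_volume_uL(inner_diameter_mm, length_mm):
    """Volume of a cylindrical tube; 1 mm^3 equals 1 uL."""
    return math.pi * (inner_diameter_mm / 2.0) ** 2 * length_mm


def residence_time_s(volume_uL, total_flow_mL_min):
    """Residence time for a given combined flow rate."""
    flow_uL_per_s = total_flow_mL_min * 1000.0 / 60.0
    return volume_uL / flow_uL_per_s


def molar_flow_mmol_min(conc_M, flow_mL_min):
    """Molar flow rate of a reagent stream (M x mL/min = mmol/min)."""
    return conc_M * flow_mL_min


v1 = tube_volume_uL(0.500, 54.3)          # reaction tube 1
t1 = residence_time_s(v1, 2.40 + 4.00)    # streams 1 and 2 combined
v2 = tube_volume_uL(0.800, 1725.0)        # reaction tube 2
t2 = residence_time_s(v2, 2.40 + 4.00 + 4.00)

n_so2cl2 = molar_flow_mmol_min(0.300, 2.40)
equiv_amine_2 = molar_flow_mmol_min(0.180, 4.00) / n_so2cl2
equiv_amine_3 = molar_flow_mmol_min(0.900, 4.00) / n_so2cl2

print(f"tube 1: {v1:.1f} uL, {t1:.2f} s; tube 2: {v2:.0f} uL, {t2:.1f} s")
print(f"equivalents: amine 2 = {equiv_amine_2:.2f}, amine 3 = {equiv_amine_3:.2f}")
```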

Acknowledgements

This work was supported by the Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research: BINDS) from the Japan Agency for Medical Research and Development (AMED) under Grant Number JP20am0101099.

    Conflict of interest

    The authors declare no conflict of interest.