Designing carbamazepine crystallization, three steps with CovaSolv
Solvent screening, cooling curve, anti-solvent. Why acetone dissolves more at both temperatures but ethanol gives the higher yield, and how water as an anti-solvent crashes solubility by a factor of 220 without cooling a single degree.
Oliver Kraft
CovaSyn

TL;DR
- Step 1, solvent ranking at 25 °C. Acetone (≈ 33 g/L) and MEK (≈ 25 g/L) are the best solvents; water is about three orders of magnitude lower. Water stays in scope because it becomes the lever in step 3.
- Step 2, cooling curve 60 → 5 °C. Acetone dissolves more at both temperatures than ethanol. Ethanol still wins on yield: about 93 %, versus 80 % in acetone. The reason is residual solubility in the mother liquor at 5 °C.
- Step 3, anti-solvent water. Water in acetone at 25 °C drops solubility from 22 g/L to 0.1 g/L at 90 vol % water. A factor of 220, without changing temperature.
- What this means in practice. A pure 25 °C ranking from step 1 would have pointed to acetone. The counter-intuitive answer only shows up when the cooling curve and the mother liquor are in the model.
- Honesty. The numbers are CovaSolv predictions, not lab measurements. On the ICLR 2026 scaffold holdout CovaSolv sits at R² 0.92. That benchmark sets the accuracy to expect from the design loop below.
What this is about
Designing a crystallization sounds like a textbook exercise and is in practice a series of trial-and-error loops with solvent orders, weekend setups, and a QC lab waiting for results. When the first two solvents do not dissolve the compound, or the cooling curve gives a bad yield, days are lost. With a validated solubility model up front, the design loop collapses to minutes at a screen before the first batch hits the lab.
What follows is exactly such a design loop for carbamazepine, a compound that is well covered in the literature and therefore works as a sanity check for the model.
Step 1, solvent ranking at 25 °C
Before designing a crystallization, you want to know in which solvent the compound is reasonably soluble in the first place. CovaSolv predicts that across nine common process solvents in one call.

The naive answer at this point would be: pick acetone or MEK, the solubility is highest there. That is wrong, and the next step shows why.
Step 2, cooling crystallization 60 → 5 °C
A cooling crystallization works on the temperature gradient of solubility. You dissolve as much as possible at a hot temperature and cool in a controlled way, and the API crystallizes out. Whatever stays in the mother liquor at the cold endpoint is lost yield.
CovaSolv gives solubility as a function of temperature for both ethanol and acetone.

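Once you write the mass balance, the advantage reverses. The relative yield of a cooling crystallization is the share of the hot-dissolved material that crystallizes, 1 - S_cold / S_hot. It rises with the ratio of hot-temperature to cold-temperature solubility, and whatever stays dissolved in the mother liquor at 5 °C is the loss.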
In CovaSolv's modelled numbers:
- Ethanol: about 93 % yield between 60 °C and 5 °C.
- Acetone: about 80 % yield over the same span.
Acetone dissolves more, but also keeps more in the mother liquor at 5 °C. Ethanol stays lower in absolute terms, but its hot-to-cold ratio is better. That is the non-obvious point: anyone reading only the 25 °C table from step 1 runs into the wrong solvent.
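The mass balance behind those two yields can be sketched in a few lines of Python. This is a minimal illustration, assuming a solution saturated at the hot temperature, equilibrium at the cold endpoint and constant solvent volume. The function names are ours, not part of CovaSolv, and the only inputs are the two yields quoted above.

```python
def cooling_yield(s_hot: float, s_cold: float) -> float:
    """Fraction of a hot-saturated solution recovered as crystals.

    Simplified model: saturation at the hot temperature, equilibrium at
    the cold endpoint, constant solvent volume.
    """
    if s_hot <= 0 or s_cold < 0:
        raise ValueError("s_hot must be positive and s_cold non-negative")
    return (s_hot - s_cold) / s_hot


def required_ratio(target_yield: float) -> float:
    """Hot-to-cold solubility ratio implied by a given relative yield."""
    if not 0 <= target_yield < 1:
        raise ValueError("target_yield must be in [0, 1)")
    return 1.0 / (1.0 - target_yield)


# Yields quoted in the text, converted back to the ratio they imply.
for solvent, y in [("ethanol", 0.93), ("acetone", 0.80)]:
    print(f"{solvent}: yield {y:.0%} implies S_hot/S_cold of about {required_ratio(y):.1f}")
```

Under this simplified model, a 93 % yield corresponds to a roughly 14-fold solubility drop between 60 °C and 5 °C, while an 80 % yield corresponds to only a 5-fold drop. That ratio, not the absolute solubility, is what separates the two solvents.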
Step 3, anti-solvent for the last few percent
With ethanol cooling, 93 % is recovered. The last 7 % sit in the mother liquor. If you want them too, without cooling further, you need an anti-solvent. We noted in step 1 that water dissolves carbamazepine three orders of magnitude worse than the organic solvents. That is exactly why it works as an anti-solvent.
The cleanest illustration is in acetone, which shows the steeper effect. CovaSolv predicts the acetone-water mixture solubility at 25 °C across the full range.

In practice you combine the two: cooling crystallization out of ethanol gives the 93 %, a downstream anti-solvent polishing step with water clears the rest from the mother liquor. Designing that early in the process saves solvent volumes and time later on.
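A solubility drop is not the same thing as a recovery, because the anti-solvent also adds volume. The sketch below turns the acetone-water figures from the TL;DR (22 g/L in pure acetone, 0.1 g/L at 90 vol % water, both at 25 °C) into a rough mass balance. It assumes a solution saturated before the addition, additive volumes and full equilibrium, which real mixtures only approximate. The function name is hypothetical.

```python
def antisolvent_recovery(s_start: float, s_mix: float, antisolvent_vol_frac: float) -> float:
    """Fraction of dissolved API precipitated by adding anti-solvent.

    s_start: solubility in the pure solvent in g/L (solution assumed saturated)
    s_mix: solubility in the final solvent/anti-solvent mixture in g/L
    antisolvent_vol_frac: anti-solvent volume fraction of the final mixture
    Assumes additive volumes and equilibrium.
    """
    if not 0 <= antisolvent_vol_frac < 1:
        raise ValueError("antisolvent_vol_frac must be in [0, 1)")
    # One litre of solvent grows to this total volume after the addition.
    final_volume = 1.0 / (1.0 - antisolvent_vol_frac)
    # Grams that remain dissolved per original litre of solvent.
    still_dissolved = s_mix * final_volume
    return 1.0 - still_dissolved / s_start


fold_drop = 22.0 / 0.1
recovery = antisolvent_recovery(22.0, 0.1, 0.90)
print(f"solubility drop: {fold_drop:.0f}x, recovery: {recovery:.1%}")
```

With these inputs the 220-fold solubility drop translates into a recovery of roughly 95 %, not 99.5 %, because at 90 vol % water the mixture has about ten times the original volume. That trade-off between solubility drop and dilution is worth checking before fixing the water fraction.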
What these three minutes of design loop deliver
The plot set above came out of CovaSolv tool calls in seconds. In practice: instead of three weeks of lab iteration for solvent selection, you get the structured recommendation in one chat turn and then verify with two small confirmation batches, not twenty blind ones.
Honesty
The numbers above are CovaSolv model predictions out of the MCP, not in-house lab measurements for carbamazepine. On the ICLR 2026 scaffold holdout CovaSolv sits at R² 0.92 and RMSE 0.64 log, with 78 % of predictions landing within 0.5 log. Before locking the process you run a small verification series in the lab. The design loop above is the fast path that comes first.
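To make the log-scale error figures tangible, the following sketch converts them into multiplicative bands in g/L. It is plain arithmetic on the numbers quoted above. The function name is ours, and the 22 g/L input is the predicted acetone value from the anti-solvent step.

```python
def prediction_band(predicted_g_per_l: float, log_error: float) -> tuple[float, float]:
    """Lower and upper bound for a prediction given an error in log10 units."""
    factor = 10 ** log_error
    return predicted_g_per_l / factor, predicted_g_per_l * factor


low, high = prediction_band(22.0, 0.5)
print(f"0.5 log band around 22 g/L: {low:.1f} to {high:.1f} g/L")
print(f"RMSE of 0.64 log is a factor of about {10 ** 0.64:.1f}")
```

An error of 0.5 log is a factor of about 3.2 in either direction, so a predicted 22 g/L could plausibly sit anywhere from roughly 7 to 70 g/L. That spread is why the small confirmation series in the lab stays part of the workflow.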
FAQ
Which solvent for cooling crystallization?
Not the one with the highest absolute solubility, but the one with the best ratio of hot-temperature to cold-temperature solubility. For carbamazepine above: ethanol over acetone, because the ethanol mother liquor holds a smaller share of the dissolved API at 5 °C.
How do you calculate crystallization yield?
Simplified: (S_hot - S_cold) / S_hot. The loss is what stays in the mother liquor at the cold endpoint. That makes the solvent with the lower cold-temperature solubility more attractive, even if it dissolves less at the hot end.
When does anti-solvent crystallization apply?
When residual solubility after cooling is still too high, or when the compound is temperature-sensitive and heat-cool cycles need to be avoided. The precondition is an anti-solvent that is miscible with the primary solvent and in which the compound is markedly less soluble. Three orders of magnitude is a good indicator.
Why does the solvent with the highest solubility not give the highest yield?
Because yield runs on the hot-to-cold ratio, not on absolute solubility. A solvent that dissolves the compound well both hot and cold leaves more API in the mother liquor.
How the architecture maps to this
CovaSolv is one of around 130 deterministic tools an AI agent can call through the CovaSyn MCP layer. The agent understands the question, plans the tool calls, CovaSolv does the math, the agent summarises the result. That separation between language and computation is the reason design loops like this work reliably today and do not collapse into LLM hallucination mode.
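The separation between planning and computing can be illustrated with a toy dispatcher. This is a generic sketch of the pattern, not the CovaSyn MCP interface: the registry, the tool name `fold_change` and the plan format are all hypothetical.

```python
from typing import Any, Callable


def fold_change(before: float, after: float) -> float:
    """Deterministic helper: how many times lower `after` is than `before`."""
    return before / after


# Hypothetical registry; tool names and signatures are illustrative only.
TOOLS: dict[str, Callable[..., float]] = {"fold_change": fold_change}


def run_plan(plan: list[tuple[str, dict[str, Any]]]) -> list[float]:
    """Execute (tool_name, arguments) pairs with plain code, no language model."""
    results = []
    for name, kwargs in plan:
        if name not in TOOLS:
            raise KeyError(f"unknown tool: {name}")
        results.append(TOOLS[name](**kwargs))
    return results


# In a real agent the plan would come from the model; here it is written by hand.
plan = [("fold_change", {"before": 22.0, "after": 0.1})]
print(run_plan(plan))
```

The language model only decides which tool to call and with which arguments. The number itself comes from a function that returns the same result for the same input every time.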
What comes next
The next entry in this series takes on the cleaning-risk data set, which feeds cleaning validation for GMP equipment, a different audience in QA and validation. The logic is the same: validated models up front, a few targeted lab confirmations at the end.
Try it yourself
The free tier ships 100 credits per week, all 130 tools included, no credit card required. Plugs into Claude Desktop, Cursor, ChatGPT (Custom GPT) or your own agent. /en/mcp.
