The Value of SUBSET Searching in CAS REGISTRY℠
Science IP®, The CAS Search Service
There are times when a structure search of the full REGISTRY database returns unanticipated answers. These answers can be removed in many ways (e.g., using a SUBSET search, modifying the structure, removing answers manually, or using NOT logic). This example shows an effective strategy that uses Molecular Formulas (MF) to create a more focused SUBSET of the REGISTRY database that does not contain the unanticipated answers.
Search Example: Find references that cite shampoos or hair conditioners containing polyethylene glycols (PEG) with the following structure:

      CH2CH2-[-OCH2CH2]n-OR
      |
    R-N(+)-R
      |
      CH2CH2-[-OCH2CH2]n-OR

    R = ANYTHING
For this example, we first create a structure representing the desired PEGs using STN Express® before logging in to the online session. Note that the “R” groups are omitted since we will be conducting a substructure search (SSS), which allows any substitution at every location where it is not prohibited (e.g., by specifying H atoms). When saving a structure created in STN Express, we have the option to “Refine Using Structure Filters”. “Filters” (screen numbers) are numeric terms that represent requirements that may otherwise be difficult or impossible to specify in a structure drawing. In this example we added filters that require the presence of a charge (screen 2040) and the presence of a Structural Repeating Unit (SRU) with its end groups specified (screen 2069). When we upload the structure to our online session, STN automatically creates L-numbers for the screen numbers, the structure, and the query combining these requirements.

=> FILE REGISTRY
=> SCREEN 2040 AND 2069
L1 SCREEN CREATED

=> Uploading Q:/2011 Science IP/PEGstructure.str
L2 STRUCTURE UPLOADED
Created in STN Express.

=> QUE L2 AND L1
L3 QUE L2 AND L1
STN automatically creates a query (QUE) using the screens (L1) and uploaded structure (L2).

=> DISPLAY QUE
L1 SCR 2040 AND 2069
L2 STR

[Structure diagram: a central N+ bearing two CH2-CH2-O-CH2-CH2-O arms]

Structure attributes must be viewed using STN Express query preparation.
L3 QUE ABB=ON PLU=ON L2 AND L1
Now that the structure is uploaded and the appropriate screens are specified, we can search the REGISTRY database. We begin our SEARCH with default settings for a sample search (SAM) and substructure search (SSS).

=> SEARCH L3
'EXTEND' DOES NOT APPLY TO SAMPLE SEARCHES
SAMPLE SEARCH INITIATED 21:37:54
SAMPLE SCREEN SEARCH COMPLETED - 182 TO ITERATE
100.0% PROCESSED 182 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **COMPLETE**
                       BATCH **COMPLETE**
PROJECTED ITERATIONS: 2831 TO 4449
PROJECTED ANSWERS: 624 TO 1496

L4 50 SEA SSS SAM L2 AND L1

EXTEND only applies when searching the full database.
L4 contains the candidate answers.
In this search, 182 substances passed the initial screening and were processed in an atom-by-atom and bond-by-bond comparison until system limits were encountered (in this case, when the maximum of 50 answers for a sample search was reached). Full file projections estimate the results expected from a search of the full database and indicate that the FULL FILE search will run to completion.

DISPLAY results using SCAN to verify the search.

=> D SCAN

L4 50 ANSWERS REGISTRY COPYRIGHT 2011 ACS on STN
IN Poly(oxy-1,2-ethanediyl), α,α'-[(methyloctadecyliminio)di-2,1-ethanediyl]bis[ω-hydroxy-ω'-(2-sulfoethoxy)-, inner salt (9CI)
MF (C2 H4 O)n (C2 H4 O)n C25 H53 N O5 S
CI PMS

[Structure diagram: a methyloctadecyl quaternary N+ bearing two PEG chains, one ending in OH and one ending in a 2-sulfoethoxy group, -O3S-CH2-CH2-O-]
HOW MANY MORE ANSWERS DO YOU WISH TO SCAN? (1):1

L4 50 ANSWERS REGISTRY COPYRIGHT 2011 ACS on STN
IN Poly(oxy-1,2-ethanediyl), α-hydro-ω-hydroxy-, ether with N-[4-[[4-[bis(2-hydroxyethyl)amino]phenyl](2-sulfophenyl)methylene]-2,5-cyclohexadien-1-ylidene]-2-hydroxy-N-(2-hydroxyethyl)ethanaminium inner salt (4:1)
MF (C2 H4 O)n (C2 H4 O)n (C2 H4 O)n (C2 H4 O)n C27 H32 N2 O7 S
CI PMS

[Structure diagram: four PEG chains ending in OH, two attached to N and two attached to N+; the molecule also carries an -O3S group]

This answer contains four PEG moieties, two more than were specified in the search question.

HOW MANY MORE ANSWERS DO YOU WISH TO SCAN? (1):0
These answers indicate that our answer set contains acceptable answers as well as some undesired answers. At this point there are several options, such as running the full database search and then working out how to remove the undesired answers, or modifying the structure (or creating a separate structure to be excluded from the search using NOT logic).

In this example, we chose another approach. We will create a SUBSET of the REGISTRY database that contains all acceptable answers but none of the undesired structures by using molecular formula information for both the acceptable answers and the undesired answers. Notice that the MF for the acceptable PEGs begins with (C2H4O)n(C2H4O)n, and that the MF for the undesired PEGs begins with (C2H4O)n(C2H4O)n(C2H4O)n. Using truncation on these MFs, we create an L-number containing all PEGs with 2 or more PEG groups, and an L-number containing all PEGs with 3 or more PEG groups. [Note that since the MFs for polymers contain parentheses, we must mask the parentheses with quote marks to avoid getting an online error message.]

Finally, we create an L-number that excludes the second set from the first, resulting in an L-number that contains all possible answers of interest but none of the undesired answers. We can then search our structure against only these substances (a SUBSET search). We will verify the effectiveness of our strategy with a SAMPLE search.

=> S "(C2H4O)N(C2H4O)N"?/MF
L5 10628 "(C2H4O)N(C2H4O)N"?/MF
MF combined with the ? truncation symbol gives answers with 2 or more PEG groups.

=> S "(C2H4O)N(C2H4O)N(C2H4O)N"?/MF
L6 3908 "(C2H4O)N(C2H4O)N(C2H4O)N"?/MF
Creates an L-number for substances with 3 or more PEG moieties.

=> S L5 NOT L6
L7 6720 L5 NOT L6
L7 contains substances with exactly 2 PEG moieties.
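The truncation-plus-NOT strategy can be pictured with a small Python sketch. It is a toy model, not STN's implementation: it assumes that right truncation behaves like a plain prefix match on the formula text, and the helper names `mf_term` and `matches` are invented for illustration. Three of the formulas are copied from the answers displayed in this example; the fourth is a hypothetical single-PEG formula added only to exercise the filter.

```python
# Toy model of right-truncated MF searching combined with NOT logic.
# Assumption: "?" truncation acts like a prefix match on the formula text;
# the real REGISTRY matching rules may differ.

PEG = "(C2H4O)N"


def mf_term(n_groups: int) -> str:
    """Build the quoted, truncated search term for n PEG groups."""
    return '"' + PEG * n_groups + '"?/MF'


def matches(term: str, formula: str) -> bool:
    """True if the formula starts with the text between the quote marks."""
    prefix = term.split('"')[1]
    normalized = formula.upper().replace(" ", "")
    return normalized.startswith(prefix)


# The first three formulas come from the displayed answers; the last one is
# hypothetical and exists only to show a formula that fails the filter.
formulas = [
    "(C2 H4 O)n (C2 H4 O)n C25 H53 N O5 S",
    "(C2 H4 O)n (C2 H4 O)n C41 H82 N O5",
    "(C2 H4 O)n (C2 H4 O)n (C2 H4 O)n (C2 H4 O)n C27 H32 N2 O7 S",
    "(C2 H4 O)n C18 H38 O",
]

l5 = {f for f in formulas if matches(mf_term(2), f)}  # 2 or more PEG groups
l6 = {f for f in formulas if matches(mf_term(3), f)}  # 3 or more PEG groups
l7 = l5 - l6                                          # exactly 2 PEG groups

for formula in sorted(l7):
    print(formula)
```

Every formula matched by the three-group term is also matched by the two-group term, so L6 lies entirely inside L5. That agrees with the posted counts: 10628 - 3908 = 6720.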
=> S L3 SUBSET=L7 SAM
'EXTEND' DOES NOT APPLY TO SAMPLE SEARCHES
SAMPLE SUBSET SEARCH INITIATED 21:39:28
SAMPLE SUBSET SCREEN SEARCH COMPLETED - 32 TO ITERATE
100.0% PROCESSED 32 ITERATIONS 25 ANSWERS
SEARCH TIME: 00.00.01

PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
PROJECTED ITERATIONS (WITHIN SPECIFIED SUBSET): 301 TO 979
PROJECTED ANSWERS (WITHIN SPECIFIED SUBSET): 200 TO 800

L8 25 SEA SUB=L7 SSS SAM L2 AND L1
DISPLAY answers to verify the search.

=> D SCAN

L8 25 ANSWERS REGISTRY COPYRIGHT 2011 ACS on STN
IN Poly(oxy-1,2-ethanediyl), α,α'-[[(2-hydroxyethyl)methyliminio]di-2,1-ethanediyl]bis[ω-[(1-oxoheptadecyl)oxy]- (9CI)
MF (C2 H4 O)n (C2 H4 O)n C41 H82 N O5
CI PMS, COM

[Structure diagram: a quaternary N+ bearing methyl, 2-hydroxyethyl, and two PEG chains, each ending in a heptadecanoate ester, Me-(CH2)15-C(=O)-O-]
HOW MANY MORE ANSWERS DO YOU WISH TO SCAN? (1):1

L8 25 ANSWERS REGISTRY COPYRIGHT 2011 ACS on STN
IN Poly(oxy-1,2-ethanediyl), α,α'-[(methyloctadecyliminio)di-2,1-ethanediyl]bis[ω-hydroxy-ω'-(2-sulfoethoxy)-, inner salt (9CI)
MF (C2 H4 O)n (C2 H4 O)n C25 H53 N O5 S
CI PMS

[Structure diagram: a methyloctadecyl quaternary N+ bearing two PEG chains, one ending in OH and one ending in a 2-sulfoethoxy group, -O3S-CH2-CH2-O-]
HOW MANY MORE ANSWERS DO YOU WISH TO SCAN? (1):1

L8 25 ANSWERS REGISTRY COPYRIGHT 2011 ACS on STN
IN Poly(oxy-1,2-ethanediyl), α,α'-[(methyloctyliminio)di-2,1-ethanediyl]bis[ω-(acetyloxy)-, methyl sulfate (9CI)
MF (C2 H4 O)n (C2 H4 O)n C17 H34 N O4 . C H3 O4 S

CM 1
[Structure diagram: a methyloctyl quaternary N+ bearing two PEG chains, each ending in OAc]

CM 2
[Structure diagram: methyl sulfate anion, Me-O-SO3-]

Answers have two PEG moieties!

HOW MANY MORE ANSWERS DO YOU WISH TO SCAN? (1):0
To obtain all relevant structures, we run the structure search as a full SUBSET search.

=> S L3 SUBSET=L7 FULL
FULL SUBSET SEARCH INITIATED 21:39:58
L9 603 SEA SUB=L7 SSS FUL L2 AND L1 EXTEND
CANDIDATE STRUCTURE SEARCH COMPLETED - 603 TO ITERATE
100.0% PROCESSED 603 ITERATIONS 401 ANSWERS
SEARCH TIME: 00.00.01
L10 401 SEA SUB=L7 SSS FUL L2 AND L1

=> SAVE L10 SUBSETPEGREG/A
ANSWER SET L10 HAS BEEN SAVED AS 'SUBSETPEGREG/A'
It is cost-effective to save our answer set rather than run our search again later. We use our answer set (L10) from the full SUBSET search to find references in the CAplus database. This example uses a simple strategy to find references for shampoos and hair conditioners. Use additional search terms for a more comprehensive search.

=> FILE CAPLUS; SEARCH L10 (L) (SHAMPOO OR HAIR TREATMENT OR HAIR CONDITION?)
           527 L10
          8750 SHAMPOO
         13796 SHAMPOOS
         15551 SHAMPOO
                 (SHAMPOO OR SHAMPOOS)
         86848 HAIR
          6617 HAIRS
         89811 HAIR
                 (HAIR OR HAIRS)
       2994625 TREATMENT
        283889 TREATMENTS
       3140601 TREATMENT
                 (TREATMENT OR TREATMENTS)
          1604 HAIR TREATMENT
                 (HAIR(W)TREATMENT)
         86848 HAIR
          6617 HAIRS
         89811 HAIR
                 (HAIR OR HAIRS)
       2749705 CONDITION?
          5403 HAIR CONDITION?
                 (HAIR(W)CONDITION?)
L11         26 L10 (L) (SHAMPOO OR HAIR TREATMENT OR HAIR CONDITION?)

Time saving tips:
- Try using stacked commands.
- SET PLURALS ON PERM searches singular and plural forms.
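The posted counts in this transcript illustrate why the combined singular/plural line is smaller than the sum of its parts. The Python sketch below applies inclusion-exclusion to the three singular/plural pairs shown. It assumes that each posted count is a number of records and that the combined line is the OR of the two forms; the function name `overlap` is invented for illustration.

```python
# Inclusion-exclusion on posted term counts.
# Assumption: counts are record counts and the combined line is
# (singular OR plural), so records containing both forms are counted once.


def overlap(singular: int, plural: int, combined: int) -> int:
    """Number of records that contain both the singular and plural form."""
    return singular + plural - combined


# (singular, plural, combined) counts copied from the search output.
posted = {
    "SHAMPOO": (8750, 13796, 15551),
    "HAIR": (86848, 6617, 89811),
    "TREATMENT": (2994625, 283889, 3140601),
}

for term, (singular, plural, combined) in posted.items():
    both = overlap(singular, plural, combined)
    print(f"{term}: {both} records contain both forms")
```

Under these assumptions, 6995 records contain both SHAMPOO and SHAMPOOS, which is why the combined count is 15551 rather than 22546.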
Verify answer quality by viewing the Title (TI) and Indexing Terms (IT) using DISPLAY SCAN TI HITIND.

=> DISPLAY SCAN TI HITIND

L11 26 ANSWERS HCAPLUS COPYRIGHT 2011 ACS on STN
TI Hair treatment compositions
IT 60-12-8, Phenyl ethyl alcohol 100-51-6, Benzenemethanol, biological studies 104-54-1 111-77-3, Diethylene glycol monomethyl ether 111-90-0, Diethylene glycol monoethyl ether 112-03-8 112-85-6, Behenic acid 122-99-6, Phenoxyethanol 622-08-2, 2-Benzyloxyethanol 1320-67-8, Methoxy propanol 17301-53-0 52125-53-8, Propanol, ethoxy 282107-97-5 282107-98-6 282107-99-7
   RL: BUU (Biological use, unclassified); BIOL (Biological study); USES (Uses)
   (hair treatment compns. containing aliphatic acids and quaternary ammonium compds.)

Our substances from L10 and search terms are highlighted in red.

HOW MANY MORE ANSWERS DO YOU WISH TO SCAN? (1):4
SCAN a few more answers.

L11 26 ANSWERS HCAPLUS COPYRIGHT 2011 ACS on STN
TI Detergent compositions exhibiting low skin irritation and containing a quaternary ammonium cationic surfactant and a carboxylate anionic surfactant
IT 112-00-5, Dodecyltrimethylammonium chloride 112-02-7, Hexadecyltrimethylammonium chloride 112-03-8, Trimethylstearylammonium chloride 137-16-6, Sodium N-lauroylsarcosinate 629-25-4, Sodium laurate 683-10-3, Lauryldimethylammonioacetate 3010-24-0 9002-92-0 9004-98-2, Polyethyleneglycolmonooleyl ether 9036-19-5, Polyethyleneglycolmono(octylphenyl) ether 21539-58-2, N-Lauroyl-N-methyl-β-alanine sodium salt 28724-32-5 55535-58-5 73502-67-7, Potassium N-myristoylglutamate 101063-66-5 124080-05-3 124303-72-6 124425-08-7 124425-10-1 130711-82-9 130711-83-0
   RL: USES (Uses)
   (detergents and shampoos containing, with low skin irritation)

L11 26 ANSWERS HCAPLUS COPYRIGHT 2011 ACS on STN
TI Hair conditioning compositions comprising quaternary ammonium-containing silicones and ethoxylated monoalkyl quats
IT 112-02-7, Cetrimonium chloride 20182-63-2, Stearamidopropyl dimethylamine 28880-55-9 81859-24-7, Polyquaternium 10
   RL: BUU (Biological use, unclassified); BIOL (Biological study); USES (Uses)
   (hair conditioning compns. comprising quaternary ammonium-containing silicones and ethoxylated monoalkyl quats)

L11 26 ANSWERS HCAPLUS COPYRIGHT 2011 ACS on STN
TI shampoos containing amphoteric, nonionic, cationic, and anionic surfactants
IT 112-03-8 4292-10-8 9004-82-4 16693-53-1 28724-32-5 74541-88-1 151709-70-5 151709-72-7
   RL: BIOL (Biological study)
   (conditioning shampoos containing)

L11 26 ANSWERS HCAPLUS COPYRIGHT 2011 ACS on STN
TI Hair conditioning compositions containing cationic surfactant and fatty alcohol
IT 107-64-2, Dimethyldistearylammonium chloride 112-02-7, Cetrimonium chloride 17301-53-0, Behenyltrimonium chloride 28880-55-9, PEG-2 oleammonium chloride
   RL: COS (Cosmetic use); BIOL (Biological study); USES (Uses)
   (hair conditioning compns. containing cationic surfactant and fatty alc.)

HOW MANY MORE ANSWERS DO YOU WISH TO SCAN? (1):0

=> SAVE L11 SUBSETPEGHCA/A
ANSWER SET L11 HAS BEEN SAVED AS 'SUBSETPEGHCA/A'

The answers look good based on those scanned. Save, export or print the references.
Answers in CAplus can also show the structure using DISPLAY IBIB AB HITSTR.

=> DISPLAY 1 IBIB ABS HITSTR

L10 ANSWER 1 OF 26 CAPLUS COPYRIGHT 2012 ACS on STN
ACCESSION NUMBER:       2003:568591 HCAPLUS Full-text
DOCUMENT NUMBER:        139:122440
TITLE:                  Hair treatment composition containing silicone wax particles
INVENTOR(S):            Bracken, Gillian; Cunningham, Paul John; Neill, Paul H.; Tollerton, Sigrun
PATENT ASSIGNEE(S):     Unilever N.V., Neth.; Unilever P.L.C.
SOURCE:                 Eur. Pat. Appl., 19 pp. CODEN: EPXXDW
DOCUMENT TYPE:          Patent
LANGUAGE:               English
FAMILY ACC. NUM. COUNT: 1
PATENT INFORMATION:

PATENT NO.       KIND  DATE      APPLICATION NO.  DATE
---------------  ----  --------  ---------------  --------
EP 1329213       A1    20030723  EP 2002-80264    20021216
EP 1329213       B1    20080305
    R: AT, BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE, MC, PT, IE, SI, LT, LV, FI, RO, MK, CY, AL, TR, BG, CZ, EE, SK
AT 387908        T     20080315  AT 2002-80264    20021216
ES 2301605       T3    20080701  ES 2002-80264    20021216
JP 2003212735    A     20030730  JP 2002-367775   20021219
JP 4794803       B2    20111019
US 20030147828   A1    20030807  US 2003-346472   20030117
US 7347993       B2    20080325
PRIORITY APPLN. INFO.:           EP 2002-250400   A 20020121

ASSIGNMENT HISTORY FOR US PATENT AVAILABLE IN LSUS DISPLAY FORMAT

AB Rinse off hair treatment compns. comprise particles wherein at least 90% by weight of the particles have an average maximum dimension of 10 nm-300 μm, the particles comprising a silicone wax having 1 or more C3-40, branched or unbranched, saturated or unsatd., optionally substituted hydrocarbon groups and the wax having a m.p. of from 30-100°. The compns. can be used in a method of treating hair which comprises: applying to the hair particles wherein at least 90% by weight of the particles have an average maximum dimension of from 10-300 μm, the particles comprising a silicone wax having one or more C3-40, branched or unbranched, saturated or unsatd., optionally substituted hydrocarbon groups and the wax having a m.p. of 30-100°; and heating the hair to a temperature above the m.p. of the particles. The method can act to condition the hair. Thus, hair conditioner composition contained Arquad 16-29 2.80, Arquad 2HT 0.50, Laurex CS 3.00, Natrosol HHR 0.20, EDTA 0.10, KCl 0.30, Abil-2440 5.00, preservative qs, and water to 100%.

IT 28880-55-9, Ethoquad O 12PG
   RL: COS (Cosmetic use); BIOL (Biological study); USES (Uses)
   (Ethoquad O 12PG; hair treatment composition containing silicone wax particles)
RN 28880-55-9 HCAPLUS
CN Poly(oxy-1,2-ethanediyl), α,α'-[[methyl-(9Z)-9-octadecen-1-yliminio]di-2,1-ethanediyl]bis[ω-hydroxy-, chloride (1:1) (CA INDEX NAME)

[Structure diagram: a methyl (9Z)-9-octadecenyl quaternary N+ bearing two PEG chains ending in OH, with a Cl- counterion]

OS.CITING REF COUNT: 2  THERE ARE 2 CAPLUS RECORDS THAT CITE THIS RECORD (2 CITINGS)
REFERENCE COUNT:     6  THERE ARE 6 CITED REFERENCES AVAILABLE FOR THIS RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT
Here is a quick review of our search history to find shampoo and hair conditioner references containing PEG moieties with a specified structure.

=> DISPLAY HISTORY FULL

FILE 'REGISTRY'
L1        SCREEN 2040 AND 2069
L2        STRUCTURE UPLOADED
L3        QUE ABB=ON PLU=ON L2 AND L1
          D QUE
L4     50 SEA SSS SAM L2 AND L1
L5  10628 SEA ABB=ON PLU=ON "(C2H4O)N(C2H4O)N"?/MF
L6   3908 SEA ABB=ON PLU=ON "(C2H4O)N(C2H4O)N(C2H4O)N"?/MF
L7   6720 SEA ABB=ON PLU=ON L5 NOT L6
L8     25 SEA SUB=L7 SSS SAM L2 AND L1
L9    603 SEA SUB=L7 SSS FUL L2 AND L1 EXTEND
L10   401 SEA SUB=L7 SSS FUL L2 AND L1
          SAVE L10 SUBSETPEGREG/A

FILE 'HCAPLUS'
L11    26 SEA ABB=ON PLU=ON L10 (L) (SHAMPOO OR HAIR TREATMENT OR HAIR CONDITION?)
          SAVE L11 SUBSETPEGHCA/A