NCBI C Toolkit Cross Reference

C/config/README


  1 
  2 Beginning with Release 2.0 of the Entrez:Sequences and Pre-Release 1.0 of the
  3 Entrez:MEDLINE CD-ROMs, it is possible for the Entrez application to access
  4 one or both data sets. This will allow users to potentially have access to the
  5 rich set of MEDLINE articles on the Entrez:MEDLINE disc, while simultaneously
  6 having access to the sequence data on the Entrez:Sequences disc.
  7 
  8 This functionality requires both
  9  i) a new version of the Entrez application software, and
 10 ii) a more complex configuration file, to handle the wide variety of
 11     possible configurations.
 12 
 13 Users who wish to use only _one_ of the Entrez:Sequences or Entrez:MEDLINE discs
 14 (never in combination) will not need to make any changes in their configuration
 15 files; the new Entrez software is backwards compatible with these old
 16 configuration files. If you fall into this category and have already installed
 17 Entrez on your machine, then there is no need to read further.
 18 
 19 Note that the complex configuration mechanism which is discussed below is
 20 not currently supported. We would, however, appreciate hearing about any
 21 problems which you may encounter. A supported version with a user-friendly
 22 configuration program will be provided with Release 3.0 of the Entrez:Sequences
 23 CD-ROM.
 24 
 25 Also note that, because there will be no pre-Release 2.0 Entrez:MEDLINE CD-ROM,
 26 there will be a few MEDLINE abstracts which will be present on the latest
 27 Entrez:Sequences CD-ROM (Release 2.0), but not on the latest Entrez:MEDLINE
 28 CD-ROM (Pre-Release 1.0). This is a deviation from future CD-ROM releases,
 29 when the MEDLINE records on the latest Entrez:Sequences CD-ROM will be a
 30 proper subset of those on the latest Entrez:MEDLINE CD-ROM.
 31 
 32 Users who wish to use both CD-ROMs on a single CD-ROM drive are advised to
 33 make sure that they have two CD-ROM caddies available (one for each CD).
 34 Frequent switching of CD-ROMs between a single caddy and the CD jewel boxes can
 35 induce high levels of stress.
 36 
 37 Seven pre-canned configuration files are made available, with corresponding
 38 .ini and .cnf files for MS Windows and the Macintosh, respectively. The
 39 selected file should be modified as appropriate for your machine, renamed
 40 to NCBI.INI or ncbi.cnf, and used to replace the NCBI.INI or ncbi.cnf file, as
 41 outlined in the Entrez manual's installation instructions.
 42 
 43 
 44 The pre-canned configuration files are as follows:
 45 
 46     NCBISM1D.XXX      MEDLINE and Sequence CDs, with only one CD-ROM drive
 47     NCBISM2D.XXX      MEDLINE and Sequence CDs, with two CD-ROM drives
 48     NCBIMO.XXX        MEDLINE CD only
 49     NCBISO.XXX        Sequence CD only
 50     NCBISHMC.XXX      Harddisk-based Sequence CD image, and MEDLINE CD-ROM
 51     NCBISCMH.XXX      Harddisk-based MEDLINE CD image, and Sequence CD-ROM
 52     NCBISHMH.XXX      MEDLINE CD and Sequence CD images, both on hard disk
 53 
 54 The following customizations may be necessary for your machine:
 55 
 56 * For all files, the only changes to be made will be within the first 20-or-so
 57   lines of the configuration file, within the "NCBI" section and the media
 58   sections "ENTREZ_xxx_yD" (where xxx is one of "SEQ" or "MED", and y is one of
 59   "C" or "H").
 60 
 61 * A copy of the CDROMDAT.VAL for each CD-ROM to be used must be stored on the
 62   hard disk, in the directory pointed to by the "VAL" field for the
 63   corresponding media. Note that these files are _different_ for the Sequence
 64   and MEDLINE CD-ROMs, and the correct file must be stored in each prescribed
 65   location.
 66 
 67 * For improved performance, a copy of the index files should be copied onto
 68   the hard disk for each CD-ROM to be used, if space is available. These
 69   canned configuration files assume the availability of such index files on a
 70   hard disk. If you choose not to install the index files, then remove the
 71   "IDX=" lines from the configuration file which you have selected. Again,
 72   note that the index files are _different_ for the Sequence and MEDLINE
 73   CD-ROMs, and the correct files must be stored in each prescribed location.
 74 
 75 * For MS-Windows, it is assumed that the first CD-ROM drive is drive D and the
 76   second CD-ROM drive (NCBISM2D.INI only) is drive E. Change this as necessary.
 77 
 78 * For Macintosh systems, the hard disk name yourHardDisk should be changed to
 79   the name of your hard disk.
 80 
 81 * For both types of systems, it is assumed that all hard disk files reside on
 82   the same hard disk. Change this as necessary.
 83 
 84 * If you choose to copy both CD-ROMs to your hard disk (NCBISHMH.XXX), then
 85   you need not copy the medline directory from the Entrez:Sequences CD-ROM
 86   onto your hard disk, since this portion of MEDLINE is a proper subset of
 87   the MEDLINE on the Entrez:MEDLINE CD-ROM.
 88 
 89 * Note that a copy of the appropriate seven configuration files for your
 90   platform (Mac or Windows) will be automatically copied by the installation
 91   procedure into the ENTREZ\CONFIG folder. You may, however, find an alternate
 92   copy of the configuration files on the CD-ROM in the SOFTWARE\CONFIG folder,
 93   inside the MAC and WIN folders.
 94 
 95 * Note that the "ROOT=" field in the "[NCBI]" section of the multi-source
 96   configuration files (ncbi????.XXX) is not used by this version of Entrez,
 97   but is provided for backwards compatibility with older versions of Entrez,
 98   as well as for compatibility with other applications using the older
 99   version of our data access libraries.
100 
101 * Because the MEDLINE entries on Release 2.0 of the Entrez:Sequences CD-ROM
102   are not a proper subset of the entries on pre-Release 1.0 of the
103   Entrez:MEDLINE CD-ROM, it may be necessary to set an additional configuration
104   parameter, if you begin to encounter "Missing UID" errors when running
105   Entrez. This is the only parameter which should be set in the
106   entrez.[cnf/ini] configuration file; all other parameters should be
107   set in the ncbi.[cnf/ini] configuration file. The parameter which should
108   be set is "SHOWALLERRORS=FALSE", in the [PREFERENCES] section. It is
109   strongly recommended that you first get your configuration of Entrez
110   running properly, before adding this line to your entrez.[cnf/ini] file.
111   This configuration option will mute many errors, which may make it
112   difficult to debug your configuration difficulties.
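
The index-file customization above (removing the "IDX=" lines when the index
files are not installed on the hard disk) can be sketched in Python. This is
only an illustration and not part of the Entrez distribution; the function
name drop_idx_lines and the sample field values are hypothetical.

```python
def drop_idx_lines(config_text):
    """Return config_text without any line that sets the IDX field."""
    kept = []
    for line in config_text.splitlines():
        # Compare the field name only, ignoring case and surrounding spaces.
        name = line.split("=", 1)[0].strip().upper()
        if "=" in line and name == "IDX":
            continue
        kept.append(line)
    return "\n".join(kept)


# Hypothetical media section; the values are placeholders, not shipped defaults.
sample = "[ENTREZ_SEQ_CD]\nTYPE=CD\nROOT=D:\\\nIDX=C:\\ENTREZ\\SEQ\\INDEX"
print(drop_idx_lines(sample))
```

Keep a backup of the original file before applying a change of this kind.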
113   
114 
115 
116 EXAMPLE
117 
118 You wish to use both CD-ROMs, on a single CD-ROM drive, under MS Windows.
119 Suppose that the device for your CD-ROM drive is named "F:", not "D:".
120 
121 Install Entrez from Release 2.0 of the Entrez:Sequences CD-ROM, per the
122 installation/update instructions in the manual.
123 
124 Make a copy of ncbism1d.ini from \ENTREZ\CONFIG\NCBISM1D.INI, first having
125 saved a copy of NCBI.INI (if you had one):
126 
127     COPY C:\WIN\NCBI.INI C:\WIN\NCBIINI.BAK
128     COPY C:\ENTREZ\CONFIG\NCBISM1D.INI C:\WIN\NCBI.INI
129 
130 Edit C:\WIN\NCBI.INI with your favorite editor, and change the occurrences
131 of ROOT=D:\ to ROOT=F:\. Save your changes and exit the editor.
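
For readers who would rather script this edit than make it by hand, the
following Python sketch performs the same substitution. It is an
illustration only: the function name change_root_drive is hypothetical, and
it assumes that each ROOT field sits on a line of its own in the form
ROOT=<drive>:\...

```python
import re


def change_root_drive(config_text, old_drive, new_drive):
    """Rewrite ROOT=<old_drive>: to ROOT=<new_drive>: on every ROOT line."""
    pattern = re.compile(
        r"^(\s*ROOT\s*=\s*)" + re.escape(old_drive) + r":",
        re.IGNORECASE | re.MULTILINE,
    )
    # A function is used as the replacement so that no character in the
    # new drive name is treated as a regular-expression escape.
    return pattern.sub(lambda match: match.group(1) + new_drive + ":", config_text)


# Hypothetical fragment; only the ROOT line should change.
before = "ROOT=D:\\\nIDX=C:\\ENTREZ\\SEQ\\INDEX"
print(change_root_drive(before, "D", "F"))
```

Fields other than ROOT, such as the hard-disk paths, are left untouched.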
132 
133 Create some directories, if they don't already exist:
134     MKDIR C:\ENTREZ\MED
135     MKDIR C:\ENTREZ\MED\INDEX
136     MKDIR C:\ENTREZ\SEQ
137     MKDIR C:\ENTREZ\SEQ\INDEX
138 
139 Copy the sequence index files, and CDROMDAT.VAL to your hard disk.
140     COPY F:\INDEX\*.* C:\ENTREZ\SEQ\INDEX
141     COPY F:\CDROMDAT.VAL C:\ENTREZ\SEQ
142 
143 Now, eject the Entrez:Sequences CD, insert the Entrez:MEDLINE CD, and
144 copy the MEDLINE index files and CDROMDAT.VAL to your hard disk.
145     COPY F:\INDEX\*.* C:\ENTREZ\MED\INDEX
146     COPY F:\CDROMDAT.VAL C:\ENTREZ\MED
147 
148 Now, start up Windows (if it's not already running), and launch Entrez.
149 It doesn't matter which CD-ROM, if any, is inserted into the CD-ROM drive
150 (although, for convenience, it generally makes sense to insert the CD-ROM
151 which you would like to use first). The Entrez application will inform
152 you when it is time to insert the other CD-ROM.
153 
154 NOTES
155 
156 When using two CD-ROMs on a single CD-ROM drive, the Macintosh version will
157 automatically eject the CD-ROM which is currently inserted. Ejection
158 must be performed manually for the Microsoft Windows version.
159 
160 Ejecting a CD-ROM at times other than that directed by the Entrez application
161 may result in undesirable effects.
162 
163 If you _must_ eject a CD-ROM on the Macintosh when Entrez is running, it
164 is important to drag the CD-ROM icon to the trash can, rather than using
165 the Eject selection from the Finder's FILE menu. The latter may result in 
166 undesirable effects.
167 
168 
169 
170 The remainder of this document is a technical discussion which should not
171 be necessary for the reader who only wants to install Entrez on their system.
172 
173 
174 ----------------------------------------------------------------------------
175 
176                            TECHNICAL DISCUSSION
177 
178 The new configuration files consist of a three-level structure. This
179 hierarchy is implemented by using unique user-specified names for sections
180 within the configuration file, as well as some reserved section names.
181 
182 The top-level of the hierarchy consists of three reserved-named sections,
183 "MEDLINE", "SEQUENCE", and "LINKS", each of which contains a single _field_,
184 "CHANNELS". Channels are used to specify the mechanisms by which the
185 corresponding types of data can be obtained. For example, considering the
186 Entrez:MEDLINE and Entrez:Sequences CDs, it is possible to obtain some
187 MEDLINE information from either CD, but Sequence information may only be
188 obtained from the Entrez:Sequences CD. Therefore, the value of the channels
189 field for MEDLINE will contain two user-defined channel names, but the value
190 of the channels field for SEQUENCE will only contain one such channel name.
191 
192 Each name listed on the right-hand-side of "CHANNELS=" must correspond
193 to a section-name at the second-level of the hierarchy; the "Channels"
194 level. Each Channel-level entry consists of a list of priorities for 
195 the possible types of data associated with that channel. Priorities are used
196 by the Entrez software to determine which Channel it should attempt to
197 use for obtaining the corresponding data. A priority of 0 indicates that
198 this channel should never be used to obtain this data. For positive values,
199 a higher priority indicates a preference for that data channel. For example,
200 the Channel for obtaining MEDLINE records from the Entrez:Sequences disc
201 might have priority 1, while the corresponding Channel associated with
202 the Entrez:MEDLINE disc might have priority 2, because the latter is a better
203 source for this data. The integer-valued priorities may optionally be
204 followed by a comma and the keyword "NO_DRASTIC_ACTION". This means that,
205 if the priority for this channel is higher than any other, but a "drastic
206 action" would need to be taken to make this channel active (like ejecting
207 a different CD-ROM), then the software may defer to a channel with lower
208 priority (e.g., one whose CD-ROM is currently inserted).
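
The priority rules above can be sketched in Python. This is a reading of the
description in this document, not the actual Entrez selection code. The
channel names, the function names, and the fallback used when every channel
is skipped are assumptions made for illustration.

```python
def parse_priority(value):
    """Split '2' or '2,NO_DRASTIC_ACTION' into (priority, no_drastic)."""
    parts = [part.strip() for part in value.split(",")]
    no_drastic = len(parts) > 1 and parts[1].upper() == "NO_DRASTIC_ACTION"
    return int(parts[0]), no_drastic


def choose_channel(channels, needs_drastic_action):
    """Pick a channel name for one data type.

    channels maps a channel name to its raw priority value for that type.
    needs_drastic_action(name) reports whether activating the channel would
    require something like ejecting the CD-ROM that is currently inserted.
    """
    usable = []
    for name, value in channels.items():
        priority, no_drastic = parse_priority(value)
        if priority > 0:  # priority 0 means "never use this channel"
            usable.append((priority, no_drastic, name))
    usable.sort(key=lambda entry: entry[0], reverse=True)

    for priority, no_drastic, name in usable:
        if no_drastic and needs_drastic_action(name):
            continue  # defer to a lower-priority channel
        return name
    # Assumption: if every usable channel was skipped, use the best one anyway.
    return usable[0][2] if usable else None


# Hypothetical channel names for MEDLINE RECORDS.
medline_records = {"MED_FROM_MED_CD": "2,NO_DRASTIC_ACTION", "MED_FROM_SEQ_CD": "1"}
sequences_cd_inserted = lambda name: name == "MED_FROM_MED_CD"
print(choose_channel(medline_records, sequences_cd_inserted))
```

With the Entrez:Sequences disc inserted, the sketch settles for the
lower-priority channel rather than asking for a disc change. With the
Entrez:MEDLINE disc inserted, it picks the priority 2 channel.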
209 
210 The possible data types for MEDLINE and SEQUENCE channels are:
211     RECORDS   - Entire MEDLINE abstracts, or Sequence entries; these
212                 correspond to double-clicking on a document summary
213                 in the Documents window
214     DOCSUMS   - A document summary; these appear as a scrolled list in
215                 the document summary window
216     TERMS     - These are the terms specified in term selection in the
217                 Query window
218     BOOLEANS  - This is the operation performed during query refinement
219 The default priority for an unreferenced data type (e.g., TERMS) is 1.
220 
221 A LINKS channel consists of an "INFO" priority, used to access global
222 information about Entrez status, and a set of links relationships. Links
223 relationships are specified by the "from" name, followed by two underscores,
224 followed by the "to" name. For example, the name for MEDLINE to SEQUENCE
225 links is "MEDLINE__SEQUENCE". The default priority for these double-underscored
226 names is 0, while the default priority for "INFO" is 1.
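
The default priorities described above can be summarized in a short Python
sketch. The function names are hypothetical, and a channel section is
modeled here as a plain dictionary of field names to values.

```python
def link_field(from_name, to_name):
    """Build a links relationship name: "from", two underscores, "to"."""
    return from_name + "__" + to_name


def channel_priority(section, field):
    """Return the priority of `field` in a channel section, with defaults."""
    if field in section:
        # Ignore an optional ",NO_DRASTIC_ACTION" suffix.
        return int(section[field].split(",")[0])
    if "__" in field:
        return 0  # unreferenced links relationships default to 0
    return 1  # unreferenced data types and INFO default to 1


links_channel = {"INFO": "1", link_field("MEDLINE", "SEQUENCE"): "2"}
print(channel_priority(links_channel, "MEDLINE__SEQUENCE"))
print(channel_priority(links_channel, link_field("SEQUENCE", "MEDLINE")))
```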
227 
228 Each Channel section must also contain a "MEDIA" field. This references
229 the lowest level in the hierarchy, Media.
230 
231 A Media, in turn, contains a "TYPE" field (currently this value must be either
232 CD or HARDDISK), and a set of fields which correspond to much of the original
233 set of fields which appeared in the "NCBI" section of the old-style
234 configuration file (namely: IDX, ROOT, etc.). There is an additional field,
235 "VAL", which must point to the filename for CDROMDAT.VAL. The default value for
236 "VAL" is the value specified by "ROOT".
237 
238 A media section must also contain the field "FORMAL_NAME", which is
239 the formal name to be used for that media when the software addresses the
240 user (e.g. "Entrez:MEDLINE CD-ROM"). A media section may also contain
241 one or more fields of the form "DRASTIC_TO_mmm=1", where mmm is the
242 section name of another media. This is used in conjunction with the
243 "NO_DRASTIC_ACTION" option which may appear in some Channel fields.
244 For example, within a Media section "ENTREZ_MED_CD", the field
245 "DRASTIC_TO_ENTREZ_SEQ_CD=1" means that it is considered to be a drastic
246 action to switch from the Entrez:MEDLINE CD-ROM to the Entrez:Sequences
247 CD-ROM.
248 
249 The "NCBI" section must contain the DATA and ASNLOAD entries, indicating
250 where the data files and ASN.1 object loader definitions are to be found.
251 In addition, the "NCBI" section must contain a "MEDIA" field, which is
252 a comma-separated list of all the Media which will be used. Note that
253 this constitutes a deviation from the 3-level model mentioned earlier.
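
To make the hierarchy concrete, the Python sketch below parses a small,
made-up configuration and checks that every channel and media reference
resolves. The channel names, the DATA and ASNLOAD paths, and the use of a
comma to separate channel names are assumptions for illustration; only the
reserved section names and field names come from the discussion above. The
shipped configuration files may differ.

```python
import configparser

EXAMPLE = r"""
[NCBI]
DATA=C:\ENTREZ\DATA
ASNLOAD=C:\ENTREZ\ASNLOAD
MEDIA=ENTREZ_SEQ_CD,ENTREZ_MED_CD

[MEDLINE]
CHANNELS=MED_FROM_MED_CD,MED_FROM_SEQ_CD

[SEQUENCE]
CHANNELS=SEQ_FROM_SEQ_CD

[LINKS]
CHANNELS=LINKS_FROM_SEQ_CD

[MED_FROM_MED_CD]
MEDIA=ENTREZ_MED_CD
RECORDS=2

[MED_FROM_SEQ_CD]
MEDIA=ENTREZ_SEQ_CD
RECORDS=1

[SEQ_FROM_SEQ_CD]
MEDIA=ENTREZ_SEQ_CD

[LINKS_FROM_SEQ_CD]
MEDIA=ENTREZ_SEQ_CD
MEDLINE__SEQUENCE=1

[ENTREZ_MED_CD]
TYPE=CD
FORMAL_NAME=Entrez:MEDLINE CD-ROM
ROOT=D:\
DRASTIC_TO_ENTREZ_SEQ_CD=1

[ENTREZ_SEQ_CD]
TYPE=CD
FORMAL_NAME=Entrez:Sequences CD-ROM
ROOT=D:\
DRASTIC_TO_ENTREZ_MED_CD=1
"""


def load(text):
    """Parse configuration text, keeping field names in their original case."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str
    parser.read_string(text)
    return parser


def check_hierarchy(parser):
    """Return a list of problems found in the three-level structure."""
    problems = []
    media_names = [name.strip() for name in parser["NCBI"]["MEDIA"].split(",")]
    for top in ("MEDLINE", "SEQUENCE", "LINKS"):
        for channel in parser[top]["CHANNELS"].split(","):
            channel = channel.strip()
            if not parser.has_section(channel):
                problems.append(top + ": no section for channel " + channel)
                continue
            media = parser[channel].get("MEDIA")
            if media not in media_names or not parser.has_section(media):
                problems.append(channel + ": unknown media " + str(media))
    return problems


print(check_hierarchy(load(EXAMPLE)))
```

An empty list means that each top-level section names only existing
channels, and each channel names a media that is both listed in the "NCBI"
section and defined as a section of its own.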
254 
255 A discussion of the pathname redirection used in both old and new-style
256 configuration files is in order here, since it has never been fully
257 documented on earlier CD-ROMs or CD-ROM documentation. The pathname
258 specification field names are: "ROOT", "IDX", "TRM", "MED", "SEQ", "LNK".
259 By default, all pathnames are relative to the directory specified by
260 "ROOT", which is a mandatory field. The remaining fields, which are all
261 optional, override the pathname specified by "ROOT" for a specific set of files,
262 as follows:
263 * IDX    - Index files
264 * TRM    - Term list files (and their associated indices and "posting files")
265 * MED    - ASN.1 data for MEDLINE documents
266 * SEQ    - ASN.1 data for sequence documents
267 * LNK    - Links among related documents
268 * 
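
The override rule above can be sketched in Python as follows. The function
name and the sample paths are hypothetical. The sketch returns only the base
directory for a category of files and makes no claim about how file names
are laid out beneath it.

```python
OVERRIDE_FIELDS = ("IDX", "TRM", "MED", "SEQ", "LNK")


def base_directory(media_section, kind):
    """Return the directory used for one category of files.

    media_section maps field names to values; ROOT is mandatory, and the
    fields in OVERRIDE_FIELDS are optional overrides of ROOT.
    """
    if kind not in OVERRIDE_FIELDS:
        raise ValueError("unknown pathname field: " + kind)
    root = media_section["ROOT"]  # raises KeyError if ROOT is missing
    return media_section.get(kind, root)


# Hypothetical media section: index files on the hard disk, the rest on CD.
media = {"ROOT": "F:\\", "IDX": "C:\\ENTREZ\\SEQ\\INDEX"}
print(base_directory(media, "IDX"))
print(base_directory(media, "TRM"))
```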
