C++/src/util/regexp/NON-UNIX-USE


  1 Compiling PCRE on non-Unix systems
  2 ----------------------------------
  3 
  4 This document contains the following sections:
  5 
  6   General
  7   Generic instructions for the PCRE C library
  8   The C++ wrapper functions
  9   Building for virtual Pascal
 10   Stack size in Windows environments
 11   Linking programs in Windows environments
      Calling conventions in Windows environments
 12   Comments about Win32 builds
 13   Building PCRE on Windows with CMake
 14   Use of relative paths with CMake on Windows
 15   Testing with runtest.bat
 16   Building under Windows with BCC5.5
      Building under Windows CE with Visual Studio 200x
 17   Building PCRE on OpenVMS
 18 
 19 
 20 GENERAL
 21 
 22 I (Philip Hazel) have no experience of Windows or VMS systems and how their
 23 libraries work. The items in the PCRE distribution and Makefile that relate to
 24 anything other than Unix-like systems are untested by me.
 25 
 26 There are some other comments and files (including some documentation in CHM
 27 format) in the Contrib directory on the FTP site:
 28 
 29   ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib
 30 
 31 If you want to compile PCRE for a non-Unix system (especially for a system that
 32 does not support "configure" and "make" files), note that the basic PCRE
 33 library consists entirely of code written in Standard C, and so should compile
 34 successfully on any system that has a Standard C compiler and library. The C++
 35 wrapper functions are a separate issue (see below).
 36 
 37 The PCRE distribution includes a "configure" file for use by the Configure/Make
 38 build system, as found in many Unix-like environments. There is also support
 39 for CMake, which some users prefer, particularly in Windows environments.
 40 There are some instructions for CMake under Windows in the section entitled
 41 "Building PCRE on Windows with CMake" below. CMake can also be used to
 42 build PCRE in Unix-like systems.
 43 
 44 
 45 GENERIC INSTRUCTIONS FOR THE PCRE C LIBRARY
 46 
 47 The following are generic comments about building the PCRE C library "by hand".
 48 
 49  (1) Copy or rename the file config.h.generic as config.h, and edit the macro
 50      settings that it contains to whatever is appropriate for your environment.
 51      In particular, if you want to force a specific value for newline, you can
 52      define the NEWLINE macro. When you compile any of the PCRE modules, you
 53      must specify -DHAVE_CONFIG_H to your compiler so that config.h is included
 54      in the sources.
 55 
 56      An alternative approach is not to edit config.h, but to use -D on the
 57      compiler command line to make any changes that you need to the
 58      configuration options. In this case -DHAVE_CONFIG_H must not be set.
 59 
 60      NOTE: There have been occasions when the way in which certain parameters
 61      in config.h are used has changed between releases. (In the configure/make
 62      world, this is handled automatically.) When upgrading to a new release,
 63      you are strongly advised to review config.h.generic before re-using what
 64      you had previously.
 65 
 66  (2) Copy or rename the file pcre.h.generic as pcre.h.
 67 
 68  (3) EITHER:
 69        Copy or rename file pcre_chartables.c.dist as pcre_chartables.c.
 70 
 71      OR:
 72        Compile dftables.c as a stand-alone program (using -DHAVE_CONFIG_H if
 73        you have set up config.h), and then run it with the single argument
 74        "pcre_chartables.c". This generates a set of standard character tables
 75        and writes them to that file. The tables are generated using the default
 76        C locale for your system. If you want to use a locale that is specified
 77        by LC_xxx environment variables, add the -L option to the dftables
 78        command. You must use this method if you are building on a system that
 79        uses EBCDIC code.
 80 
 81      The tables in pcre_chartables.c are defaults. The caller of PCRE can
 82      specify alternative tables at run time.
 83 
 84  (4) Ensure that you have the following header files:
 85 
 86        pcre_internal.h
 87        ucp.h
 88 
 89  (5) Also ensure that you have the following file, which is #included as source
 90      when building a debugging version of PCRE, and is also used by pcretest.
 91 
 92        pcre_printint.src
 93 
 94  (6) Compile the following source files, setting -DHAVE_CONFIG_H as a compiler
 95      option if you have set up config.h with your configuration, or else use
 96      other -D settings to change the configuration as required.
 97 
 98        pcre_chartables.c
 99        pcre_compile.c
100        pcre_config.c
101        pcre_dfa_exec.c
102        pcre_exec.c
103        pcre_fullinfo.c
104        pcre_get.c
105        pcre_globals.c
106        pcre_info.c
107        pcre_maketables.c
108        pcre_newline.c
109        pcre_ord2utf8.c
110        pcre_refcount.c
111        pcre_study.c
112        pcre_tables.c
113        pcre_try_flipped.c
114        pcre_ucd.c
115        pcre_valid_utf8.c
116        pcre_version.c
117        pcre_xclass.c
118 
119      Make sure that you include -I. in the compiler command (or equivalent for
120      an unusual compiler) so that all included PCRE header files are first
121      sought in the current directory. Otherwise you run the risk of picking up
122      a previously-installed file from somewhere else.
123 
124  (7) Now link all the compiled code into an object library in whichever form
125      your system keeps such libraries. This is the basic PCRE C library. If
126      your system has static and shared libraries, you may have to do this once
127      for each type.
128 
129  (8) Similarly, compile pcreposix.c (remembering -DHAVE_CONFIG_H if necessary)
130      and link the result (on its own) as the pcreposix library.
131 
132  (9) Compile the test program pcretest.c (again, don't forget -DHAVE_CONFIG_H).
133      This needs the functions in the pcre and pcreposix libraries when linking.
134      It also needs the pcre_printint.src source file, which it #includes.
135 
136 (10) Run pcretest on the testinput files in the testdata directory, and check
137      that the output matches the corresponding testoutput files. Note that the
138      supplied files are in Unix format, with just LF characters as line
139      terminators. You may need to edit them to change this if your system uses
140      a different convention. If you are using Windows, you probably should use
141      the wintestinput3 file instead of testinput3 (and the corresponding output
142      file). This is a locale test; wintestinput3 sets the locale to "french"
143      rather than "fr_FR", and there are some minor output differences.
144 
145 (11) If you want to use the pcregrep command, compile and link pcregrep.c; it
146      uses only the basic PCRE library (it does not need the pcreposix library).
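
Steps (6) and (8) above can be sketched as a small dry-run script that only
prints the compile commands, which may help on systems without "make". This is
an illustration, not part of the PCRE distribution: the compiler name "cc", the
"-c" option, the function name, and the NEWLINE=10 value in the usage note are
assumptions to be replaced by whatever your compiler and configuration need.

```python
# Dry-run sketch: build the compile commands for steps (6) and (8).
# "cc" and "-c" are assumed placeholders for your compiler and its
# "compile only" option; nothing is executed here.

# Source files listed in step (6).
PCRE_SOURCES = [
    "pcre_chartables.c", "pcre_compile.c", "pcre_config.c",
    "pcre_dfa_exec.c", "pcre_exec.c", "pcre_fullinfo.c", "pcre_get.c",
    "pcre_globals.c", "pcre_info.c", "pcre_maketables.c", "pcre_newline.c",
    "pcre_ord2utf8.c", "pcre_refcount.c", "pcre_study.c", "pcre_tables.c",
    "pcre_try_flipped.c", "pcre_ucd.c", "pcre_valid_utf8.c",
    "pcre_version.c", "pcre_xclass.c",
]


def compile_commands(sources, use_config_h=True, extra_defines=(), cc="cc"):
    """Return one argument list per source file."""
    # -I. makes the compiler look for PCRE headers in the current directory
    # first, so a previously installed pcre.h is not picked up by accident.
    flags = ["-I."]
    if use_config_h:
        # config.h was set up by hand, so the sources must include it.
        flags.append("-DHAVE_CONFIG_H")
    else:
        # Alternative approach: all configuration comes from -D options.
        flags.extend("-D" + define for define in extra_defines)
    return [[cc, *flags, "-c", source] for source in sources]


if __name__ == "__main__":
    # pcreposix.c is compiled the same way but linked as its own library.
    for argv in compile_commands(PCRE_SOURCES + ["pcreposix.c"]):
        print(" ".join(argv))
```

For the alternative approach without config.h, a call such as
compile_commands(PCRE_SOURCES, use_config_h=False, extra_defines=["NEWLINE=10"])
produces commands that carry only the -D settings.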
147 
148 
149 THE C++ WRAPPER FUNCTIONS
150 
151 The PCRE distribution also contains some C++ wrapper functions and tests,
152 contributed by Google Inc. On a system that can use "configure" and "make",
153 the functions are automatically built into a library called pcrecpp. It should
154 be straightforward to compile the .cc files manually on other systems. The
155 files called xxx_unittest.cc are test programs for each of the corresponding
156 xxx.cc files.
157 
158 
159 BUILDING FOR VIRTUAL PASCAL
160 
161 A script for building PCRE using Borland's C++ compiler for use with VPASCAL
162 was contributed by Alexander Tokarev. Stefan Weber updated the script and added
163 additional files. The following files in the distribution are for building PCRE
164 for use with VP/Borland: makevp_c.txt, makevp_l.txt, makevp.bat, pcregexp.pas.
165 
166 
167 STACK SIZE IN WINDOWS ENVIRONMENTS
168 
169 The default processor stack size of 1MB in some Windows environments is too
170 small for matching patterns that need much recursion. In particular, test 2 may
171 fail because of this. Normally, running out of stack causes a crash, but there
172 have been cases where the test program has just died silently. See your linker
173 documentation for how to increase stack size if you experience problems. The
174 Linux default of 8MB is a reasonable choice for the stack, though even that can
175 be too small for some pattern/subject combinations.
176 
177 PCRE has a compile configuration option to disable the use of stack for
178 recursion so that heap is used instead. However, pattern matching is
179 significantly slower when this is done. There is more about stack usage in the
180 "pcrestack" documentation.
181 
182 
183 LINKING PROGRAMS IN WINDOWS ENVIRONMENTS
184 
185 If you want to statically link a program against a PCRE library in the form of
186 a non-dll .a file, you must define PCRE_STATIC before including pcre.h,
187 otherwise the pcre_malloc() and pcre_free() exported functions will be declared
188 __declspec(dllimport), with unwanted results.
189 
190 
191 CALLING CONVENTIONS IN WINDOWS ENVIRONMENTS
192 
193 It is possible to compile programs to use different calling conventions using
194 MSVC. Search the web for "calling conventions" for more information. To make it
195 easier to change the calling convention for the exported functions in the
196 PCRE library, the macro PCRE_CALL_CONVENTION is present in all the external
197 definitions. It can be set externally when compiling (e.g. in CFLAGS). If it is
198 not set, it defaults to empty; the default calling convention is then used
199 (which is what is wanted most of the time).
200 
201 
202 COMMENTS ABOUT WIN32 BUILDS (see also "BUILDING PCRE ON WINDOWS WITH CMAKE" below)
203 
204 There are two ways of building PCRE using the "configure, make, make install"
205 paradigm on Windows systems: using MinGW or using Cygwin. These are not at all
206 the same thing; they are completely different from each other. There is also
207 support for building using CMake, which some users find a more straightforward
208 way of building PCRE under Windows. However, the tests are not run
209 automatically when CMake is used.
210 
211 The MinGW home page (http://www.mingw.org/) says this:
212 
213   MinGW: A collection of freely available and freely distributable Windows
214   specific header files and import libraries combined with GNU toolsets that
215   allow one to produce native Windows programs that do not rely on any
216   3rd-party C runtime DLLs.
217 
218 The Cygwin home page (http://www.cygwin.com/) says this:
219 
220   Cygwin is a Linux-like environment for Windows. It consists of two parts:
221 
222   . A DLL (cygwin1.dll) which acts as a Linux API emulation layer providing
223     substantial Linux API functionality
224 
225   . A collection of tools which provide Linux look and feel.
226 
227   The Cygwin DLL currently works with all recent, commercially released x86 32
228   bit and 64 bit versions of Windows, with the exception of Windows CE.
229 
230 On both MinGW and Cygwin, PCRE should build correctly using:
231 
232   ./configure && make && make install
233 
234 This should create two libraries called libpcre and libpcreposix, and, if you
235 have enabled building the C++ wrapper, a third one called libpcrecpp. These are
236 independent libraries: when you link with libpcreposix or libpcrecpp you must
237 also link with libpcre, which contains the basic functions. (Some earlier
238 releases of PCRE included the basic libpcre functions in libpcreposix. This no
239 longer happens.)
240 
241 A user submitted a special-purpose patch that makes it easy to create
242 "pcre.dll" under mingw32 using the "msys" environment. It provides "pcre.dll"
243 as a special target. If you use this target, no other files are built, and in
244 particular, the pcretest and pcregrep programs are not built. An example of how
245 this might be used is:
246 
247   ./configure --enable-utf --disable-cpp CFLAGS="-O3 -s"; make pcre.dll
248 
249 Using Cygwin's compiler generates libraries and executables that depend on
250 cygwin1.dll. If a library that is generated this way is distributed,
251 cygwin1.dll has to be distributed as well. Since cygwin1.dll is under the GPL
252 licence, this forces not only PCRE to be under the GPL, but also the entire
253 application. A distributor who wants to keep their own code proprietary must
254 purchase an appropriate Cygwin licence.
255 
256 MinGW has no such restrictions. The MinGW compiler generates a library or
257 executable that can run standalone on Windows without any third party dll or
258 licensing issues.
259 
260 But there is more complication:
261 
262 If a Cygwin user uses the -mno-cygwin Cygwin gcc flag, what that really does is
263 to tell Cygwin's gcc to use the MinGW gcc. Cygwin's gcc is only acting as a
264 front end to MinGW's gcc (if you install Cygwin's gcc, you get both Cygwin's
265 gcc and MinGW's gcc). So, a user can:
266 
267 . Build native binaries by using MinGW or by getting Cygwin and using
268   -mno-cygwin.
269 
270 . Build binaries that depend on cygwin1.dll by using Cygwin with the normal
271   compiler flags.
272 
273 The test files that are supplied with PCRE are in Unix format, with LF
274 characters as line terminators. It may be necessary to change the line
275 terminators in order to get some of the tests to work. We hope to improve
276 things in this area in future.
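
Changing the line terminators of the test files can be sketched as below. The
function names are hypothetical and the script is not part of PCRE. It assumes
that every LF in a file is a line terminator; some test files may exercise
newline handling deliberately, so compare the results with care after
converting.

```python
def to_crlf(data: bytes) -> bytes:
    """Convert LF line terminators to CRLF."""
    # Normalise any existing CRLF to LF first, so that running the
    # conversion twice does not produce CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def convert_file(src_path, dst_path):
    """Write a CRLF copy of src_path to dst_path."""
    # Binary mode prevents the platform from translating line endings itself.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_crlf(data))
```

Writing to a separate destination keeps the supplied Unix-format files intact
for comparison.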
277 
278 
279 BUILDING PCRE ON WINDOWS WITH CMAKE
280 
281 CMake is an alternative build facility that can be used instead of the
282 traditional Unix "configure". CMake version 2.4.7 supports Borland makefiles,
283 MinGW makefiles, MSYS makefiles, NMake makefiles, UNIX makefiles, Visual Studio
284 6, Visual Studio 7, Visual Studio 8, and Watcom W8. The following instructions
285 were contributed by a PCRE user.
286 
287 1.  Download CMake 2.4.7 or above from http://www.cmake.org/, install and ensure
288     that cmake\bin is on your path.
289 
290 2.  Unzip (retaining folder structure) the PCRE source tree into a source
291     directory such as C:\pcre.
292 
293 3.  Create a new, empty build directory: C:\pcre\build\
294 
295 4.  Run CMakeSetup from the Shell environment of your build tool, e.g., Msys
296     for Msys/MinGW or Visual Studio Command Prompt for VC/VC++
297 
298 5.  Enter C:\pcre\pcre-xx and C:\pcre\build for the source and build
299     directories, respectively
300 
301 6.  Hit the "Configure" button.
302 
303 7.  Select the particular IDE / build tool that you are using (Visual Studio,
304     MSYS makefiles, MinGW makefiles, etc.)
305 
306 8.  The GUI will then list several configuration options. This is where you can
307     enable UTF-8 support, etc.
308 
309 9.  Hit "Configure" again. The adjacent "OK" button should now be active.
310 
311 10. Hit "OK".
312 
313 11. The build directory should now contain a usable build system, be it a
314     solution file for Visual Studio, makefiles for MinGW, etc.
315 
316 
317 USE OF RELATIVE PATHS WITH CMAKE ON WINDOWS
318 
319 A PCRE user comments as follows:
320 
321 I thought that others may want to know the current state of
322 CMAKE_USE_RELATIVE_PATHS support on Windows.
323 
324 Here it is:
325 -- AdditionalIncludeDirectories is only partially modified (only the
326 first path - see below)
327 -- Only some of the contained file paths are modified - shown below for
328 pcre.vcproj
329 -- It properly modifies
330 
331 I am sure CMake people can fix that if they want to. Until then one will
332 need to replace existing absolute paths in project files with relative
333 paths manually (e.g. from VS) - relative to project file location. I did
334 just that before being told to try CMAKE_USE_RELATIVE_PATHS. Not a big
335 deal.
336 
337 AdditionalIncludeDirectories="E:\builds\pcre\build;E:\builds\pcre\pcre-7.5;"
338 AdditionalIncludeDirectories=".;E:\builds\pcre\pcre-7.5;"
339 
340 RelativePath="pcre.h">
341 RelativePath="pcre_chartables.c">
342 RelativePath="pcre_chartables.c.rule">
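
The manual replacement described in the comment above can be sketched with
Python's standard ntpath module, which applies Windows path rules on any
platform. The function name is hypothetical, and the project directory
E:\builds\pcre\build is an assumption inferred from the example values; this is
not something CMake or PCRE provides.

```python
import ntpath


def relativize_include_dirs(value, project_dir):
    """Rewrite each path in a ';'-separated list relative to project_dir."""
    parts = []
    for entry in value.split(";"):
        # Empty entries are kept so that a trailing ';' survives unchanged.
        # ntpath.relpath raises ValueError if entry is on another drive.
        parts.append(ntpath.relpath(entry, project_dir) if entry else entry)
    return ";".join(parts)


if __name__ == "__main__":
    original = r"E:\builds\pcre\build;E:\builds\pcre\pcre-7.5;"
    print(relativize_include_dirs(original, r"E:\builds\pcre\build"))
```

Unlike the partially modified value quoted above, this rewrites every entry in
the list, not only the first.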
343 
344 
345 TESTING WITH RUNTEST.BAT
346 
347 1. Copy RunTest.bat into the directory where pcretest.exe has been created.
348 
349 2. Edit RunTest.bat and insert a line that identifies the relative location of
350    the pcre source, e.g.:
351 
352    set srcdir=..\pcre-7.4-RC3
353 
354 3. Run RunTest.bat from a command shell environment. Test outputs will
355    automatically be compared to expected results, and discrepancies will
356    be identified in the console output.
357 
358 4. To test pcrecpp, run pcrecpp_unittest.exe, pcre_stringpiece_unittest.exe and
359    pcre_scanner_unittest.exe.
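
Where RunTest.bat cannot be used, a comparison of pcretest output against an
expected testoutput file might be sketched as follows. The function name is
hypothetical; the sketch only assumes that both texts have already been read
into strings, and it ignores differences that are purely LF versus CRLF.

```python
import difflib


def output_differences(actual: str, expected: str):
    """Return unified-diff lines; an empty list means the outputs match."""
    # splitlines() drops LF and CRLF terminators alike, so a difference
    # that is only in line terminators is not reported.
    return list(difflib.unified_diff(
        expected.splitlines(), actual.splitlines(),
        fromfile="expected", tofile="actual", lineterm=""))


if __name__ == "__main__":
    for line in output_differences("a\nX\n", "a\nb\n"):
        print(line)
```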
360 
361 
362 BUILDING UNDER WINDOWS WITH BCC5.5
363 
364 Michael Roy sent these comments about building PCRE under Windows with BCC5.5:
365 
366   Some of the core BCC libraries have a version of PCRE from 1998 built in,
367   which can lead to pcre_exec() giving an erroneous PCRE_ERROR_NULL from a
368   version mismatch. I'm including an easy workaround below, if you'd like to
369   include it in the non-unix instructions:
370 
371   When linking a project with BCC5.5, pcre.lib must be included before any of
372   the libraries cw32.lib, cw32i.lib, cw32mt.lib, and cw32mti.lib on the command
373   line.
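
The ordering rule above can be checked mechanically before linking. The sketch
below is an illustration only: the function name and the sample object file
name are hypothetical, and it inspects a list of linker arguments without
invoking any Borland tool.

```python
import ntpath

# Runtime libraries named in the comment above.
BCC_RUNTIME_LIBS = {"cw32.lib", "cw32i.lib", "cw32mt.lib", "cw32mti.lib"}


def pcre_lib_comes_first(link_args):
    """Return True if pcre.lib precedes every cw32*.lib in link_args."""
    # Compare on lower-cased base names so that directory prefixes and
    # letter case do not matter.
    names = [ntpath.basename(arg).lower() for arg in link_args]
    if "pcre.lib" not in names:
        return False
    pcre_pos = names.index("pcre.lib")
    return all(pos > pcre_pos
               for pos, name in enumerate(names)
               if name in BCC_RUNTIME_LIBS)


if __name__ == "__main__":
    print(pcre_lib_comes_first(["myprog.obj", "pcre.lib", "cw32.lib"]))
```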
374 
375 
376 BUILDING UNDER WINDOWS CE WITH VISUAL STUDIO 200x
377 
378 Vincent Richomme sent a zip archive of files to help with this process. They
379 can be found in the file "pcre-vsbuild.zip" in the Contrib directory of the FTP
380 site.
381 
382 
383 BUILDING PCRE ON OPENVMS
384 
385 Dan Mooney sent the following comments about building PCRE on OpenVMS. They
386 relate to an older version of PCRE that used fewer source files, so the exact
387 commands will need changing. See the current list of source files above.
388 
389 "It was quite easy to compile and link the library. I don't have a formal
390 make file but the attached file [reproduced below] contains the OpenVMS DCL
391 commands I used to build the library. I had to add #define
392 POSIX_MALLOC_THRESHOLD 10 to pcre.h since it was not defined anywhere.
393 
394 The library was built on:
395 O/S: HP OpenVMS v7.3-1
396 Compiler: Compaq C v6.5-001-48BCD
397 Linker: vA13-01
398 
399 The test results did not match 100% due to the issues you mention in your
400 documentation regarding isprint(), iscntrl(), isgraph() and ispunct(). I
401 modified some of the character tables temporarily and was able to get the
402 results to match. Tests using the fr locale did not match since I don't have
403 that locale loaded. The study size was always reported to be 3 less than the
404 value in the standard test output files."
405 
406 =========================
407 $! This DCL procedure builds PCRE on OpenVMS
408 $!
409 $! I followed the instructions in the non-unix-use file in the distribution.
410 $!
411 $ COMPILE == "CC/LIST/NOMEMBER_ALIGNMENT/PREFIX_LIBRARY_ENTRIES=ALL_ENTRIES"
412 $ COMPILE DFTABLES.C
413 $ LINK/EXE=DFTABLES.EXE DFTABLES.OBJ
414 $ RUN DFTABLES.EXE/OUTPUT=CHARTABLES.C
415 $ COMPILE MAKETABLES.C
416 $ COMPILE GET.C
417 $ COMPILE STUDY.C
418 $! I had to set POSIX_MALLOC_THRESHOLD to 10 in PCRE.H since the symbol
419 $! did not seem to be defined anywhere.
420 $! I edited pcre.h and added #define SUPPORT_UTF8 to enable UTF8 support.
421 $ COMPILE PCRE.C
422 $ LIB/CREATE PCRE MAKETABLES.OBJ, GET.OBJ, STUDY.OBJ, PCRE.OBJ
423 $! I had to set POSIX_MALLOC_THRESHOLD to 10 in PCRE.H since the symbol
424 $! did not seem to be defined anywhere.
425 $ COMPILE PCREPOSIX.C
426 $ LIB/CREATE PCREPOSIX PCREPOSIX.OBJ
427 $ COMPILE PCRETEST.C
428 $ LINK/EXE=PCRETEST.EXE PCRETEST.OBJ, PCRE/LIB, PCREPOSIX/LIB
429 $! C programs that want access to command line arguments must be
430 $! defined as a symbol
431 $ PCRETEST :== "$ SYS$ROADSUSERS:[DMOONEY.REGEXP]PCRETEST.EXE"
432 $! Arguments must be enclosed in quotes.
433 $ PCRETEST "-C"
434 $! Test results:
435 $!
436 $!   The test results did not match 100%. The functions isprint(), iscntrl(),
437 $!   isgraph() and ispunct() on OpenVMS must not produce the same results
438 $!   as the system that built the test output files provided with the
439 $!   distribution.
440 $!
441 $!   The study size did not match and was always 3 less on OpenVMS.
442 $!
443 $!   Locale could not be set to fr
444 $!
445 =========================
446 
447 Last Updated: 17 March 2009
448 ****
