From: François Pinard
Date: Wed, 12 Mar 2008 05:23:53 +0000 (-0400)
Subject: Memory leaks in outer
X-Git-Tag: v3.7~236
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=a679130e06a2f637a6fa0a4a92827ab5cf793aa7;p=recode

Memory leaks in outer
---

diff --git a/THANKS b/THANKS
index 39099d0..deddce4 100644
--- a/THANKS
+++ b/THANKS
@@ -88,6 +88,7 @@ Frère Roy roy@taize.fr
   http://www.taize.fr
 Gabriel P. Silva gpsilva@geocities.com
   http://www.nce.ufrj.br/~gabriel
+Gaël Le Mignot kilobug@freesurf.fr
 Ghislain Plamondon
 Georg Haefele haefele@atlas.gis.univie.ac.at
 Greg McGary
diff --git a/doc/recode.info b/doc/recode.info
index fd9b2f5..cc49414 100644
--- a/doc/recode.info
+++ b/doc/recode.info
@@ -1487,6 +1487,19 @@ his program (let's assume the programmer is a male, here, no prejudice
 intended). This `outer' variable is given as a first argument to all
 outer level functions.
 
+   The `RECODE_OUTER' structure is really meant to be initialised only
+once in the life of a program, and terminated with the program itself.
+Program interfaces should pay attention to initialise it only once,
+would it be only for speed considerations. A good deal of overhead goes
+to outer level initialization, and if the outer level was initialized
+afresh for each and every string translated, say, the Recode library
+would appear immensely much slower that it was meant to be!
+
+   Because outer level initialization is meant to be done only once,
+not so much attention has been paid to avoid memory leaks at this level
+within Recode. This is hardly a reason for not plugging such leaks at
+any level: in the long run, they should all be chased and repaired.
+
    The `' header file uses the Boolean type setup by the system
 header file `'. But this header file is still fairly new in C
 standards, and likely does not exist everywhere.
If you
@@ -1618,10 +1631,11 @@ File: recode.info, Node: Request level, Next: Task level, Prev: Outer level,
 
    The request level functions are meant to cover most recoding needs
 programmers may have; they should provide all usual functionality.
-Their API is almost stable by now. To get started with request level
-functions, here is a full example of a program which sole job is to
-filter `ibmpc' code on its standard input into `latin1' code on its
-standard output.
+Their API is almost stable by now.
+
+   To get started with request level functions, here is a full example
+of a program which sole job is to filter `ibmpc' code on its standard
+input into `latin1' code on its standard output.
 
      #include
      #include
@@ -1652,6 +1666,14 @@ which the programmer should use for allocating a variable in his program.
 This REQUEST variable is given as a first argument to all request level
 functions, and in most cases, may be considered as opaque.
 
+   Suppose an application is doing a lot of recoding using only a few
+different requests. For speed considerations, the `RECODE_REQUEST'
+structure should ideally be cached for each kind of request, so the
+request level initialisation is not redone for each and every string
+translated. The speedup should be more apparent when Recode is able to
+optimize the work by building on the fly, within the structure, new
+specialized recoding steps and their associated data tables.
+
    * Initialisation functions
 
           RECODE_REQUEST recode_new_request (OUTER);
@@ -5001,7 +5023,7 @@ Concept Index
 * ambiguous output, error message: Errors. (line 31)
 * ASCII table, recreating with Recode: ASCII. (line 12)
 * average number of recoding steps: Main flow. (line 40)
-* bool data type: Outer level. (line 31)
+* bool data type: Outer level. (line 44)
 * box-drawing characters: Recoding. (line 16)
 * bug reports, where to send: Contributing. (line 37)
 * byte order mark: UCS-2. (line 12)
@@ -5080,8 +5102,8 @@ Concept Index
 * implied surfaces: Requests.
(line 69)
 * impossible conversions: Charset overview. (line 33)
 * information about charsets: Listings. (line 153)
-* initialisation functions, outer: Outer level. (line 84)
-* initialisation functions, request: Request level. (line 42)
+* initialisation functions, outer: Outer level. (line 97)
+* initialisation functions, request: Request level. (line 51)
 * initialisation functions, task: Task level. (line 52)
 * interface, with iconv library: iconv. (line 6)
 * intermediate charsets: Requests. (line 23)
@@ -5096,6 +5118,7 @@ Concept Index
 * LaTeX files: LaTeX. (line 6)
 * Latin charsets: ISO 8859. (line 6)
 * Latin-1 table, recreating with Recode: ISO 8859. (line 45)
+* leaks, memory: Outer level. (line 39)
 * letter case, in charset and surface names: Requests. (line 93)
 * libiconv: iconv. (line 6)
 * library, iconv: iconv. (line 6)
@@ -5104,6 +5127,7 @@ Concept Index
 * map filling: Reversibility. (line 98)
 * map filling, disable: Reversibility. (line 48)
 * markup language: HTML. (line 6)
+* memory leaks: Outer level. (line 39)
 * memory sequencing: Sequencing. (line 23)
 * MIME encodings: MIME. (line 6)
 * misuse of recoding library, error message: Errors. (line 76)
@@ -5122,7 +5146,7 @@ Concept Index
 * partial conversion: Mixed. (line 20)
 * permutations of groups of bytes: Permutations. (line 6)
 * pipe sequencing: Sequencing. (line 40)
-* program_name variable: Outer level. (line 137)
+* program_name variable: Outer level. (line 150)
 * programming language support: Listings. (line 26)
 * pseudo-charsets: Charset overview. (line 33)
 * pure charset: Surface overview. (line 17)
@@ -5153,7 +5177,9 @@ Concept Index
 * silent operation: Reversibility. (line 36)
 * single step: Main flow. (line 17)
 * source file generation: Listings. (line 26)
-* stdbool.h header: Outer level. (line 31)
+* speed considerations <1>: Request level. (line 43)
+* speed considerations: Outer level. (line 31)
+* stdbool.h header: Outer level. (line 44)
 * strict operation: Reversibility.
(line 48)
 * string and comments conversion: Mixed. (line 39)
 * structural surfaces: Surfaces. (line 44)
@@ -5255,12 +5281,12 @@ and variables in the Recode library.
 * Menu:
 
 * abort_level: Task level. (line 198)
-* ascii_graphics: Request level. (line 101)
+* ascii_graphics: Request level. (line 110)
 * byte_order_mark: Task level. (line 182)
 * declare_step: New surfaces. (line 13)
 * DEFAULT_CHARSET: Requests. (line 104)
-* diacritics_only: Request level. (line 92)
-* diaeresis_char: Request level. (line 76)
+* diacritics_only: Request level. (line 101)
+* diaeresis_char: Request level. (line 85)
 * error_so_far: Task level. (line 210)
 * fail_level: Task level. (line 188)
 * file_one_to_many: New charsets. (line 70)
@@ -5271,48 +5297,48 @@ and variables in the Recode library.
 * list_all_charsets: Charset level. (line 15)
 * list_concise_charset: Charset level. (line 15)
 * list_full_charset: Charset level. (line 15)
-* make_header_flag: Request level. (line 83)
+* make_header_flag: Request level. (line 92)
 * RECODE_AMBIGUOUS_OUTPUT: Errors. (line 31)
-* recode_buffer_to_buffer: Request level. (line 147)
-* recode_buffer_to_file: Request level. (line 147)
-* recode_delete_outer: Outer level. (line 89)
-* recode_delete_request: Request level. (line 47)
+* recode_buffer_to_buffer: Request level. (line 156)
+* recode_buffer_to_file: Request level. (line 156)
+* recode_delete_outer: Outer level. (line 102)
+* recode_delete_request: Request level. (line 56)
 * recode_delete_task: Task level. (line 54)
-* recode_file_to_buffer: Request level. (line 147)
-* recode_file_to_file: Request level. (line 147)
+* recode_file_to_buffer: Request level. (line 156)
+* recode_file_to_file: Request level. (line 156)
 * recode_filter_close: Task level. (line 217)
-* recode_filter_close, not available: Request level. (line 212)
+* recode_filter_close, not available: Request level. (line 221)
 * recode_filter_open: Task level. (line 217)
-* recode_filter_open, not available: Request level.
(line 212)
-* recode_format_table: Request level. (line 227)
+* recode_filter_open, not available: Request level. (line 221)
+* recode_format_table: Request level. (line 236)
 * RECODE_INTERNAL_ERROR: Errors. (line 81)
 * RECODE_INVALID_INPUT: Errors. (line 61)
 * RECODE_MAXIMUM_ERROR <1>: Errors. (line 88)
 * RECODE_MAXIMUM_ERROR: Task level. (line 198)
-* recode_new_outer: Outer level. (line 89)
-* recode_new_request: Request level. (line 47)
+* recode_new_outer: Outer level. (line 102)
+* recode_new_request: Request level. (line 56)
 * recode_new_task: Task level. (line 54)
 * RECODE_NO_ERROR: Errors. (line 16)
 * RECODE_NOT_CANONICAL: Errors. (line 19)
 * RECODE_OUTER structure: Outer level. (line 25)
 * recode_perform_task: Task level. (line 217)
-* recode_request structure: Request level. (line 59)
-* RECODE_REQUEST structure: Request level. (line 37)
-* recode_scan_request: Request level. (line 111)
+* recode_request structure: Request level. (line 68)
+* RECODE_REQUEST structure: Request level. (line 38)
+* recode_scan_request: Request level. (line 120)
 * RECODE_SEQUENCE_IN_MEMORY: Task level. (line 169)
 * RECODE_SEQUENCE_WITH_FILES: Task level. (line 172)
 * RECODE_SEQUENCE_WITH_PIPE: Task level. (line 175)
 * RECODE_STRATEGY_UNDECIDED: Task level. (line 162)
-* recode_string: Request level. (line 140)
-* recode_string_to_buffer: Request level. (line 147)
-* recode_string_to_file: Request level. (line 147)
+* recode_string: Request level. (line 149)
+* recode_string_to_buffer: Request level. (line 156)
+* recode_string_to_file: Request level. (line 156)
 * RECODE_SYSTEM_ERROR: Errors. (line 71)
 * RECODE_TASK structure: Task level. (line 46)
 * RECODE_UNTRANSLATABLE: Errors. (line 50)
 * RECODE_USER_ERROR: Errors. (line 76)
 * strategy: Task level. (line 162)
 * task_request structure: Task level. (line 81)
-* verbose_flag: Request level. (line 71)
+* verbose_flag: Request level.
(line 80)
 
 File: recode.info, Node: Charset and Surface Index, Prev: Library Index, Up: Top
@@ -6081,71 +6107,71 @@ Node: Emacs58661
 Node: Debugging59695
 Node: Library63965
 Node: Outer level65319
-Node: Request level71429
-Node: Task level81896
-Node: Charset level92318
-Node: Errors93160
-Ref: Errors-Footnote-198006
-Ref: Errors-Footnote-298120
-Node: Universal98481
-Ref: Universal-Footnote-1101593
-Ref: Universal-Footnote-2101659
-Node: UCS-2101872
-Node: UCS-4104398
-Node: UTF-7104938
-Node: UTF-8105533
-Node: UTF-16109838
-Node: count-characters110986
-Node: dump-with-names111657
-Node: iconv114206
-Node: Tabular117637
-Node: ASCII misc139850
-Node: ASCII140216
-Node: ISO 8859141032
-Node: ASCII-BS143326
-Node: flat145163
-Node: IBM and MS145834
-Node: EBCDIC146378
-Node: IBM-PC148474
-Ref: IBM-PC-Footnote-1150588
-Node: Icon-QNX150747
-Node: CDC151172
-Node: Display Code152853
-Ref: Display Code-Footnote-1155134
-Node: CDC-NOS155339
-Node: Bang-Bang157301
-Node: Micros159230
-Node: Apple-Mac159613
-Node: AtariST161647
-Node: Miscellaneous162633
-Node: HTML163366
-Node: LaTeX169355
-Node: Texinfo170129
-Node: Vietnamese170901
-Node: African171877
-Node: Others173227
-Node: Texte174681
-Ref: Texte-Footnote-1179231
-Ref: Texte-Footnote-2179311
-Ref: Texte-Footnote-3179786
-Node: Mule179883
-Ref: Mule-Footnote-1181664
-Node: Surfaces182183
-Ref: Surfaces-Footnote-1185602
-Node: Permutations185706
-Node: End lines186547
-Node: MIME188748
-Node: Dump189935
-Node: Test194105
-Node: Internals196583
-Node: Main flow197811
-Node: New charsets200914
-Node: New surfaces205452
-Node: Design206178
-Ref: Design-Footnote-1215344
-Node: Concept Index215448
-Node: Option Index230191
-Node: Library Index233044
-Node: Charset and Surface Index237619
+Node: Request level72193
+Node: Task level83140
+Node: Charset level93562
+Node: Errors94404
+Ref: Errors-Footnote-199250
+Ref: Errors-Footnote-299364
+Node: Universal99725
+Ref: Universal-Footnote-1102837
+Ref:
Universal-Footnote-2102903
+Node: UCS-2103116
+Node: UCS-4105642
+Node: UTF-7106182
+Node: UTF-8106777
+Node: UTF-16111082
+Node: count-characters112230
+Node: dump-with-names112901
+Node: iconv115450
+Node: Tabular118881
+Node: ASCII misc141094
+Node: ASCII141460
+Node: ISO 8859142276
+Node: ASCII-BS144570
+Node: flat146407
+Node: IBM and MS147078
+Node: EBCDIC147622
+Node: IBM-PC149718
+Ref: IBM-PC-Footnote-1151832
+Node: Icon-QNX151991
+Node: CDC152416
+Node: Display Code154097
+Ref: Display Code-Footnote-1156378
+Node: CDC-NOS156583
+Node: Bang-Bang158545
+Node: Micros160474
+Node: Apple-Mac160857
+Node: AtariST162891
+Node: Miscellaneous163877
+Node: HTML164610
+Node: LaTeX170599
+Node: Texinfo171373
+Node: Vietnamese172145
+Node: African173121
+Node: Others174471
+Node: Texte175925
+Ref: Texte-Footnote-1180475
+Ref: Texte-Footnote-2180555
+Ref: Texte-Footnote-3181030
+Node: Mule181127
+Ref: Mule-Footnote-1182908
+Node: Surfaces183427
+Ref: Surfaces-Footnote-1186846
+Node: Permutations186950
+Node: End lines187791
+Node: MIME189992
+Node: Dump191179
+Node: Test195349
+Node: Internals197827
+Node: Main flow199055
+Node: New charsets202158
+Node: New surfaces206696
+Node: Design207422
+Ref: Design-Footnote-1216588
+Node: Concept Index216692
+Node: Option Index231727
+Node: Library Index234580
+Node: Charset and Surface Index239155
 
 End Tag Table
diff --git a/doc/recode.texi b/doc/recode.texi
index 0464332..0fe7eec 100644
--- a/doc/recode.texi
+++ b/doc/recode.texi
@@ -1725,6 +1725,7 @@ at outer level, and then, various functions at request level.
 
 @section Outer level functions
 @cindex outer level functions
+
 The outer level functions mainly prepare the whole recoding library for
 use, or do actions which are unrelated to specific recodings. Here is
 an example of a program which does not really make anything useful.
@@ -1755,6 +1756,22 @@ his program (let's assume the programmer is a male, here, no prejudice
 intended).
This @samp{outer} variable is given as a first argument to all
 outer level functions.
 
+@cindex speed considerations
+The @code{RECODE_OUTER} structure is really meant to be initialised only
+once in the life of a program, and terminated with the program itself.
+Program interfaces should pay attention to initialise it only once,
+would it be only for speed considerations. A good deal of overhead goes
+to outer level initialization, and if the outer level was initialized
+afresh for each and every string translated, say, the Recode library
+would appear immensely much slower that it was meant to be!
+
+@cindex memory leaks
+@cindex leaks, memory
+Because outer level initialization is meant to be done only once, not so
+much attention has been paid to avoid memory leaks at this level within
+Recode. This is hardly a reason for not plugging such leaks at any
+level: in the long run, they should all be chased and repaired.
+
 @cindex @code{stdbool.h} header
 @cindex @code{bool} data type
 The @code{} header file uses the Boolean type setup by the
@@ -1899,10 +1916,11 @@ may be used by the library @emph{when} the user sets it to diagnose itself.
 @cindex request level functions
 The request level functions are meant to cover most recoding needs
 programmers may have; they should provide all usual functionality.
-Their API is almost stable by now. To get started with request level
-functions, here is a full example of a program which sole job is to filter
-@code{ibmpc} code on its standard input into @code{latin1} code on its
-standard output.
+Their API is almost stable by now.
+
+To get started with request level functions, here is a full example of
+a program which sole job is to filter @code{ibmpc} code on its standard
+input into @code{latin1} code on its standard output.
 
 @example
 @group
@@ -1938,6 +1956,15 @@ which the programmer should use for allocating a variable in his program.
 This @var{request} variable is given as a first argument to all request
 level functions, and in most cases, may be considered as opaque.
 
+@cindex speed considerations
+Suppose an application is doing a lot of recoding using only a few
+different requests. For speed considerations, the @code{RECODE_REQUEST}
+structure should ideally be cached for each kind of request, so the
+request level initialisation is not redone for each and every string
+translated. The speedup should be more apparent when Recode is able
+to optimize the work by building on the fly, within the structure, new
+specialized recoding steps and their associated data tables.
+
 @itemize @bullet
 @item Initialisation functions
 @cindex initialisation functions, request
diff --git a/src/ChangeLog b/src/ChangeLog
index 944143e..b70d2f1 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,11 @@
+2008-03-12 Andreas Schwab
+
+	* names.c (alias_free): New function.
+	(prepare_for_aliases): Pass it to hash_initialize.
+	* outer.c (recode_delete_outer): Free elements of
+	argmatch_charset_array and argmatch_surface_array.
+	Also reported by Gaël Le Mignot and Pawel Krawczyk.
+
 2008-03-11 François Pinard
 
 	* recodext.h (struct recode_symbol): Add iconv_name.
@@ -267,7 +275,7 @@
 	* recode.c (init_ucs2_to_byte): Free hash table after use.
 	* testdump.c (produce_count): Free hash table entries and the
 	table itself after use. Fix clean-up code.
-
+
 2000-06-28 François Pinard
 
 	* task.c: Insert union wait portability from GNU make.
diff --git a/src/names.c b/src/names.c
index 4b33a4f..6e3e86a 100644
--- a/src/names.c
+++ b/src/names.c
@@ -101,6 +101,20 @@ alias_comparator (const void *void_first, const void *void_second)
   return strcmp (first->name, second->name) == 0;
 }
 
+static void
+alias_free (void *void_alias)
+{
+  RECODE_ALIAS alias = void_alias;
+  struct recode_surface_list *list, *next;
+
+  for (list = alias->implied_surfaces; list; list = next)
+    {
+      next = list->next;
+      free (list);
+    }
+  free (alias);
+}
+
 bool
 prepare_for_aliases (RECODE_OUTER outer)
 {
@@ -108,7 +122,7 @@ prepare_for_aliases (RECODE_OUTER outer)
   outer->number_of_symbols = 0;
 
   outer->alias_table
-    = hash_initialize (800, NULL, alias_hasher, alias_comparator, free);
+    = hash_initialize (800, NULL, alias_hasher, alias_comparator, alias_free);
   if (!outer->alias_table)
     return false;
 
diff --git a/src/outer.c b/src/outer.c
index 1854d76..1d95d93 100644
--- a/src/outer.c
+++ b/src/outer.c
@@ -602,9 +602,6 @@ bool
 recode_delete_outer (RECODE_OUTER outer)
 {
   unregister_all_modules (outer);
-  /* FIXME: Pawel Krawczyk reports that calling new_outer ... delete_outer
-     20000 times in a program has the effect of consuming all virtual memory.
-     So there might be memory leaks should to track down and resolve. */
   while (outer->number_of_symbols > 0)
     {
       RECODE_SYMBOL symbol = outer->symbol_list;
@@ -626,7 +623,15 @@ recode_delete_outer (RECODE_OUTER outer)
   if (outer->alias_table)
     hash_free (outer->alias_table);
   if (outer->argmatch_charset_array)
-    free (outer->argmatch_charset_array);
+    {
+      char **cursor;
+
+      for (cursor = outer->argmatch_charset_array; *cursor; cursor++)
+        free (*cursor);
+      for (cursor = outer->argmatch_surface_array; *cursor; cursor++)
+        free (*cursor);
+      free (outer->argmatch_charset_array);
+    }
   if (outer->one_to_same)
     free ((void *) outer->one_to_same);
   free (outer);
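
The two source changes above follow one pattern: an object that owns other heap blocks has to release them before it is released itself. The sketch below restates that pattern outside of Recode, as an illustration only. The types `surface_node` and `alias_entry` and all four function names are hypothetical stand-ins and are not part of the Recode sources; the real `alias_free` returns `void`, whereas these helpers return a count purely so the behaviour can be checked.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the structures touched by the patch; the
   real definitions live in the Recode headers and carry more fields.  */
struct surface_node
{
  struct surface_node *next;
};

struct alias_entry
{
  struct surface_node *implied_surfaces;
};

/* Build an entry owning SURFACES list nodes (demonstration helper).  */
struct alias_entry *
make_alias_entry (size_t surfaces)
{
  struct alias_entry *alias = malloc (sizeof *alias);

  if (!alias)
    return NULL;
  alias->implied_surfaces = NULL;
  while (surfaces-- > 0)
    {
      struct surface_node *node = malloc (sizeof *node);

      if (!node)
        break;
      node->next = alias->implied_surfaces;
      alias->implied_surfaces = node;
    }
  return alias;
}

/* Same shape as alias_free in the patch: walk the owned list, reading
   each next pointer before its node is freed, then free the entry.
   Returns the number of list nodes released.  */
size_t
alias_entry_free (void *void_alias)
{
  struct alias_entry *alias = void_alias;
  struct surface_node *list, *next;
  size_t count = 0;

  for (list = alias->implied_surfaces; list; list = next)
    {
      next = list->next;
      free (list);
      count++;
    }
  free (alias);
  return count;
}

/* Same shape as the loops added to recode_delete_outer: free every
   string of a NULL-terminated vector, leaving the vector itself to the
   caller.  Returns the number of strings released.  */
size_t
free_string_vector_items (char **vector)
{
  size_t count = 0;
  char **cursor;

  for (cursor = vector; *cursor; cursor++)
    {
      free (*cursor);
      count++;
    }
  return count;
}
```

Passing plain `free` as the element destructor, as the code did before this commit, releases only the outer block; anything reachable solely through it is lost, which is consistent with the leak described in the removed FIXME comment.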