view contrib/unicode2nginx/unicode-to-nginx.pl @ 6015:e11a8e7e8e0c

Configure: fixed type max value detection.

The code tried to use suffixes for the "long" and "long long" types, but it never worked as intended because of a bug in the shell code. Also, the max value for any 64-bit type other than "long long" on platforms with 32-bit "long" would be incorrect if the bug were fixed. So instead of fixing the bug in the shell code, always use the "int" constant for 32-bit types, and the "long long" constant for 64-bit types.
author Ruslan Ermilov <ru@nginx.com>
date Wed, 18 Mar 2015 02:04:39 +0300
parents 63a820b0bc6c
children 8752257e883f

#!/usr/bin/perl -w

# Convert unicode mappings to nginx configuration file format.

# You may find useful mappings in various places, including
# unicode.org official site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT
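
# Usage sketch. The output file name "win-utf.map" and the charset names
# below are assumptions chosen for illustration, not part of this script:
#
#     ./unicode-to-nginx.pl CP1251.TXT > win-utf.map
#
# A mapping line such as
#
#     0x80  0x0402  #CYRILLIC CAPITAL LETTER DJE
#
# is meant to become a table entry holding the source byte and the
# UTF-8 bytes of the code point, both in hex:
#
#     80  D082 ; #CYRILLIC CAPITAL LETTER DJE
#
# Entries of this form would typically go inside a charset_map block
# in the nginx configuration, roughly:
#
#     charset_map windows-1251 utf-8 {
#         80  D082 ; #CYRILLIC CAPITAL LETTER DJE
#     }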

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@rambler-co.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x(..)\s*0x(....)\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

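		# Produce UTF-8 sequence from character code.
		# "U0" makes unpack walk the UTF-8 encoded bytes; on perl 5.10
		# and later a bare "C*" would return the code point itself.

		my $un_utf8 = join('',
			map { sprintf("%02X", $_) }
			unpack("U0C*", pack("U", hex($un_code))));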

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################