view contrib/unicode2nginx/unicode-to-nginx.pl @ 8515:c860f0b7010c

Core: disabled cloning sockets when testing config (ticket #2188). Since we anyway do not set SO_REUSEPORT when testing configuration (see ecb5cd305b06), trying to open additional sockets does not make much sense, as all these additional sockets are expected to result in EADDRINUSE errors from bind(). On the other hand, there are reports that trying to open these sockets takes significant time under load: total configuration testing time greater than 15s was observed in ticket #2188, compared to less than 1s without load. With this change, no additional sockets are opened during testing configuration.
author Maxim Dounin <mdounin@mdounin.ru>
date Mon, 31 May 2021 16:36:37 +0300
parents 8752257e883f
children
line wrap: on
line source

#!/usr/bin/perl -w

# Convert unicode mappings to nginx configuration file format.

# You may find useful mappings in various places, including
# unicode.org official site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@mdounin.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x(..)\s*0x(....)\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

		# Produce UTF-8 sequence from character code;

		my $un_utf8 = join('',
			map { sprintf("%02X", $_) }
			unpack("U0C*", pack("U", hex($un_code)))
		);

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################