view contrib/unicode2nginx/unicode-to-nginx.pl @ 7441:8acaa1161783

Stream: do not split datagrams when limiting proxy rate. Previously, when using proxy_upload_rate and proxy_download_rate, the buffer size for reading from a socket could be reduced as a result of rate limiting. For connection-oriented protocols this behavior is normal, since unread data will normally be read at the next iteration. For datagram-oriented protocols this is not the case, and the unread part of the datagram is lost. Now the buffer size is not limited for datagrams. Rate limiting still works in this case by delaying the next read event.
author Roman Arutyunyan <arut@nginx.com>
date Thu, 27 Dec 2018 19:37:34 +0300
parents 8752257e883f
children

#!/usr/bin/perl -w

# Convert Unicode mappings to the nginx configuration file format.

# Useful mappings can be found in various places, including
# the official unicode.org site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT
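
# Usage sketch. The file names below are illustrative assumptions; the
# script reads mapping files given as arguments, or standard input:
#
#     perl unicode-to-nginx.pl CP1251.TXT > cp1251-map.txt
#
# Each output line has the form "<charset byte>  <UTF-8 bytes> ; #<name>",
# which appears suitable for pasting into a charset_map block, e.g.:
#
#     charset_map windows-1251 utf-8 {
#         C0  D090 ; #CYRILLIC CAPITAL LETTER A
#     }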

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@mdounin.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x(..)\s*0x(....)\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

		# Produce the UTF-8 byte sequence (in hex) for the character code.

		my $un_utf8 = join('',
			map { sprintf("%02X", $_) }
			unpack("U0C*", pack("U", hex($un_code)))
		);
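
		# Worked example: for $un_code "0410" (CYRILLIC CAPITAL LETTER A),
		# pack("U", 0x0410) yields the character U+0410, whose UTF-8
		# encoding is the two bytes 0xD0 0x90, so $un_utf8 is "D090".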

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################