view contrib/unicode2nginx/unicode-to-nginx.pl @ 6250:0256738454dc

Increased the default number of output buffers. Since an output buffer can only be used for either reading or sending, small amounts of data left from the previous operation (due to some limits) must be sent before nginx will be able to read further into the buffer. Using only one output buffer can result in suboptimal behavior that manifests itself in forming and sending too small chunks of data. This is particularly painful with SPDY (or HTTP/2) where each such chunk needs to be prefixed with some header. The default flow-control window in HTTP/2 is 64k minus one bytes. With one 32k output buffer this results is one byte left after exhausting the window. With two 32k buffers the data will be read into the second free buffer before sending, thus the minimum output is increased to 32k + 1 bytes which is much better.
author Valentin Bartenev <vbart@nginx.com>
date Tue, 15 Sep 2015 17:49:15 +0300
parents 63a820b0bc6c
children 8752257e883f
line wrap: on
line source

#!/usr/bin/perl -w

# Convert unicode mappings to nginx configuration file format.

# You may find useful mappings in various places, including
# unicode.org official site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@rambler-co.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x(..)\s*0x(....)\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

		# Produce UTF-8 sequence from character code;

		my $un_utf8 = join('', map { sprintf("%02X", $_) } unpack("C*", pack("U", hex($un_code))));

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################