view contrib/unicode2nginx/unicode-to-nginx.pl @ 5540:3a8e19528b30

SSI: fixed $date_local and $date_gmt without SSI (ticket #230). If there is no SSI context in a given request at a given time, the $date_local and $date_gmt variables used "%s" format, instead of "%A, %d-%b-%Y %H:%M:%S %Z" documented as the default and used if there is SSI module context and timefmt wasn't modified using the "config" SSI command. While use of these variables outside of the SSI evaluation isn't strictly valid, previous behaviour is certainly inconsistent, hence the fix.
author Maxim Dounin <mdounin@mdounin.ru>
date Tue, 28 Jan 2014 15:40:45 +0400
parents 63a820b0bc6c
children 8752257e883f

#!/usr/bin/perl -w

# Convert Unicode mappings to nginx configuration file format.

# You may find useful mappings in various places, including
# the official unicode.org site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@rambler-co.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x([0-9A-Fa-f]{2})\s*0x([0-9A-Fa-f]{4})\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

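		# Produce UTF-8 byte sequence from character code.  "U0" makes
		# unpack walk the UTF-8 encoded bytes; perl 5.10 and later would
		# otherwise unpack characters, not bytes.

		my $un_utf8 = join('', map { sprintf("%02X", $_) } unpack("U0C*", pack("U", hex($un_code))));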

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################
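
The conversion performed by the loop above can be sketched for a single
mapping line as follows. This is only an illustrative sketch, not part of
the script: the function name mapping_to_nginx is hypothetical, and the
UTF-8 bytes are produced with the core Encode module (which needs a newer
perl than 5.6) instead of the pack/unpack approach used in the script. The
sample line is the 0x80 entry from CP1251.TXT.

```perl
#!/usr/bin/perl -w
use strict;
use Encode qw(encode);

# Hypothetical helper: convert one mapping line of the form
#   0xNN <whitespace> 0xNNNN <whitespace> #NAME
# into a line for an nginx charset_map block.
# Returns nothing if the line is not a mapping.
sub mapping_to_nginx {
	my ($line) = @_;

	return unless $line =~ /^\s*0x([0-9A-Fa-f]{2})\s*0x([0-9A-Fa-f]{4})\s*(#.*)/;
	my ($cs_code, $un_code, $un_name) = ($1, $2, $3);

	# Encode the code point as UTF-8 and print each byte as two hex digits.
	# Assumption: Encode is used here only as a cross-check of the
	# pack/unpack conversion in the script.
	my $bytes   = encode("UTF-8", chr(hex($un_code)));
	my $un_utf8 = join('', map { sprintf("%02X", $_) } unpack("C*", $bytes));

	return "    $cs_code  $un_utf8 ; $un_name";
}

# U+0402 encodes to the two bytes D0 82 in UTF-8.
print mapping_to_nginx("0x80\t0x0402\t#CYRILLIC CAPITAL LETTER DJE"), "\n";
# prints "    80  D082 ; #CYRILLIC CAPITAL LETTER DJE"
```

The printed line has the same shape as the script's output and is meant to
be placed inside a charset_map block in the nginx configuration.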