Functions
Replace Pioneer - text/binary/web file - Batch search, replace, convert, rename, split, download

 

Replace Pioneer supports the system functions listed below and most Perl functions.

 

 

When you type part of a function and its parameters in the "Replace with Pattern" window, a tip showing the function syntax pops up below the insertion cursor. The tip turns red if the number of parameters exceeds the number allowed.

 

 

 

The following system functions are supported:

 

 

 

 

*** Type of "String" ***

 

 

words(text,start_no|range1[,end_no|range2,[input_delimiter,[output_delimiter,[column_width1,column_width2,...]]]])

 

=> return word "start_no" through word "end_no" of text

 

Note 1: if "end_no" is absent, the function returns only word "start_no".

 

Note 2: if "start_no" or "end_no" is negative, it is counted from the end of the string.

 

Note 3: if "start_no" or "end_no" contains any of the characters ",<>!.", as in "5,6,8..10", it is interpreted as a flexible range; see Range Definition for details.

 

Note 4: Notes 2 and 3 also apply to the functions lines/chars/words_r/lines_r/chars_r.

 

Note 5: input_delimiter and output_delimiter can be set by the user; if either is undef or absent, the value from the user settings is used.

 

Note 6: if column_width1,... are defined, input_delimiter is ignored and words are split by column width; a negative width means bytes to skip.
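The indexing rules in Notes 1 and 2 can be illustrated with a small Python sketch. This is not Replace Pioneer code: the helper below is hypothetical, and it assumes that positions are 1-based, that words are separated by whitespace, and that the output delimiter is a single space. Range strings such as "5,6,8..10" are not handled.

```python
def words(text, start_no, end_no=None, delimiter=None):
    """Return words start_no..end_no (assumed 1-based, inclusive)."""
    parts = text.split(delimiter)

    def to_index(n):
        # Assumed convention: 1 is the first word, -1 is the last word.
        return n - 1 if n > 0 else len(parts) + n

    first = to_index(start_no)
    # Note 1: without end_no only the word at start_no is returned.
    last = first if end_no is None else to_index(end_no)
    return " ".join(parts[first:last + 1])


print(words("one two three four five", 2, 4))   # two three four
print(words("one two three four five", -2))     # four
```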

 

lines(text,start_no|range1[,end_no|range2])

 

=> return line "start_no" through line "end_no"; start_no/end_no can also be a range such as "5,6,8..10"

 

chars(text,start_no|range1[,end_no|range2])

 

=> return char "start_no" through char "end_no"; start_no/end_no can also be a range such as "5,6,8..10"

 

words_r(text,start_no|range1[,end_no|range2,[input_delimiter,[output_delimiter,[column_width1,column_width2,...]]]])

 

=> return word "end_no" back to word "start_no" (reverse order); start_no/end_no can also be a range such as "5,6,8..10"

 

lines_r(text,start_no|range1[,end_no|range2])

 

=> return line "end_no" back to line "start_no" (reverse order); start_no/end_no can also be a range such as "5,6,8..10"

 

chars_r(text,start_no|range1[,end_no|range2])

 

=> return char "end_no" back to char "start_no" (reverse order); start_no/end_no can also be a range such as "5,6,8..10"

 

page(page_no,[start_line|line_range1, [end_line|line_range2]])

 

=> return lines from page page_no (P0, P1, ... P15); if the line arguments are omitted, all text on the page is returned; if start_line is #, the number of lines on the page is returned.

 

 

replace(text,pattern[,newtxt,[flag:0=replace(default),1=extract,2=translate]])

 

=> finds pattern in text and replaces it with newtxt; if newtxt is absent, the matched pattern is simply removed from text. If flag=1 or 'extract', only the replaced text is kept; if flag=2 or 'translate', a letter-to-letter translation is performed.
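The three flag modes can be sketched in Python as follows. This is only one reading of the description above, not the tool's implementation: the 'extract' branch assumes that "only keep replaced text" means the replacement of each match is kept and all unmatched text is dropped, and the 'translate' branch assumes both character lists have the same length. Python writes group references as \1 rather than Perl's $1.

```python
import re


def replace(text, pattern, newtxt="", flag=0):
    """Hypothetical model of the replace/extract/translate modes."""
    if flag in (0, "replace"):
        # Replace every match; an empty newtxt removes the match.
        return re.sub(pattern, newtxt, text)
    if flag in (1, "extract"):
        # Assumed meaning: keep the replacement for each match, drop the rest.
        return "".join(m.expand(newtxt) for m in re.finditer(pattern, text))
    if flag in (2, "translate"):
        # Letter-to-letter mapping: pattern[i] becomes newtxt[i].
        return text.translate(str.maketrans(pattern, newtxt))
    raise ValueError("unknown flag: %r" % (flag,))


print(replace("a1b22c", r"\d+", "#"))          # a#b#c
print(replace("a1b22c", r"(\d+)", r"<\1>", 1))  # <1><22>
print(replace("hello", "el", "ip", 2))          # hippo
```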

 

 

sort_by_word(text,start_no|range1[,end_no|range2][,'num|date'][,'uniq'][,'ic'][,'desc'])

 

=> sort a multi-line text by the key from word "start_no" to word "end_no"; start_no/end_no can also be a range such as "5,6,8..10"

 

Note 1: if "end_no" is absent, the lines are sorted by the key from "start_no" to the last word.

 

Note 2: 'num|date' means sort as number/date, 'uniq' means remove duplicate lines, 'ic' means ignore case, 'desc' means sort in reverse order.

 

Note 3: Notes 1 and 2 also apply to the function sort_by_char.

 

sort_by_char(text,start_no|range1[,end_no|range2][,'num|date'][,'uniq'][,'ic'][,'desc'])

 

=> sort a multi-line text by the key from char "start_no" to char "end_no"; start_no/end_no can also be a range such as "5,6,8..10"

 

sort_by_func(text,'function_name'|'{function body}'[,'num'][,'uniq'][,'ic'][,'desc'])

 

=> sort a multi-line text by the execution result of function on each line

 

You can use a predefined function_name or an anonymous function body enclosed in a pair of {}.

 

The text of each line is passed to the function as its first parameter.

 

For example, sort_by_func($match,'length','num') calls length() on each line and sorts the lines by the resulting length.
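Sorting by the result of a function is the same idea as a key function in other languages. The Python sketch below is a hypothetical model of it, not the tool's code: it assumes that 'num' compares results as numbers and that results are compared as strings otherwise. The 'ic' option is left out.

```python
def sort_by_func(text, func, numeric=False, uniq=False, desc=False):
    """Sort the lines of text by func(line)."""
    lines = text.splitlines()
    if uniq:
        # Drop duplicate lines while keeping the first occurrence.
        lines = list(dict.fromkeys(lines))
    if numeric:
        key = lambda line: float(func(line))
    else:
        key = lambda line: str(func(line))
    return "\n".join(sorted(lines, key=key, reverse=desc))


# Equivalent in spirit to sort_by_func($match,'length','num')
print(sort_by_func("ccc\na\nbb", len, numeric=True))
```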

 

 

transpose(text,input_delimiter,output_delimiter,col_width1, col_width2, ...)

 

=> swaps the columns and rows of a text matrix; the input and output delimiters can be set by the user.

 

If col_width1,col_width2,... are defined, columns are separated by fixed widths; a negative width means bytes to skip between columns.

 

Example: transpose($match,undef,undef,3,-1,3,-1,5,-1) will swap columns and rows, separating columns by widths 3,(1),3,(1),5,(1),5,(1),... where the numbers in parentheses are gaps:

 

 

Sample Text:

 

 

abcdefghijklmnopqrstuvwxyz

 

abcdefghijklmnopqrstuvwxyz

 

abcdefghijklmnopqrstuvwxyz

 

 

separate by column ==>

 

 

abc efg ijklm opqrs uvwxy

 

abc efg ijklm opqrs uvwxy

 

abc efg ijklm opqrs uvwxy

 

 

swap column and row ==>

 

 

abc abc abc

 

efg efg efg

 

ijklm ijklm ijklm

 

opqrs opqrs opqrs

 

uvwxy uvwxy uvwxy

 

 

Note 1: if input_delimiter/output_delimiter/col_width are omitted or undef, the default word input/output delimiters from 'settings' are used.

 

Note 2: if the last defined col_width is a gap (<0), both the column width and the gap are repeated for all columns that follow; otherwise only the last width is repeated. For instance, (3,-1,4)=>(3,-1,4,4,4,...), but (3,4,-1)=>(3,4,-1,4,-1,4,-1,...).
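The fixed-width splitting and the repetition rule of Note 2 can be sketched in Python. The function names are hypothetical and the sketch assumes a single space as the output delimiter. It reproduces the sample shown earlier for widths 3,-1,3,-1,5,-1.

```python
def expand_widths(widths, total):
    """Repeat the tail of the width list until `total` characters are covered."""
    spec = list(widths)
    # A trailing gap repeats the last (width, gap) pair; otherwise the last width.
    tail = spec[-2:] if spec[-1] < 0 else spec[-1:]
    if sum(abs(w) for w in tail) == 0:
        return spec  # nothing sensible to repeat
    while sum(abs(w) for w in spec) < total:
        spec.extend(tail)
    return spec


def split_fixed(line, widths):
    """Cut one line into columns; negative widths are skipped characters."""
    cols, pos = [], 0
    for w in expand_widths(widths, len(line)):
        if pos >= len(line):
            break
        if w > 0:
            cols.append(line[pos:pos + w])
        pos += abs(w)
    return cols


def transpose(text, widths, out_delim=" "):
    rows = [split_fixed(line, widths) for line in text.splitlines()]
    # zip(*rows) turns the list of rows into a list of columns.
    return "\n".join(out_delim.join(col) for col in zip(*rows))


sample = "\n".join(["abcdefghijklmnopqrstuvwxyz"] * 3)
print(transpose(sample, (3, -1, 3, -1, 5, -1)))
```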

 

 

matches(text,pattern[,'ic'])

 

=> return True if text matches pattern, return False if text does not match pattern

 

Note 1: 'ic' means ignore cases.

 

Note 2: This function is mainly used in Filter 3 -- Match Condition Filter

 

rand_str(N,str1,str2,....strM,["d=delimiter|array" ])

 

=> selects N elements at random from str1 to strM and joins them into a string; the elements are joined by <CR> if N is negative. If the last parameter is "d=x", the elements are joined by x; if the last parameter is "d=array", the output is an array.

 

rand_str_unique(N,str1,str2,....strM,["d=delimiter|array" ])

 

=> selects N distinct elements at random from str1 to strM and joins them into a string; the elements are joined by <CR> if N is negative. If the last parameter is "d=x", the elements are joined by x; if the last parameter is "d=array", the output is an array.
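The difference between the two functions is sampling with and without repetition. A Python sketch under stated assumptions: the helper name and keyword arguments are hypothetical, the default delimiter for a positive N is assumed to be the empty string, <CR> is modelled as a line break, and the "d=array" form is not covered.

```python
import random


def rand_str(n, *items, delimiter="", unique=False, rng=random):
    """Pick abs(n) items at random and join them."""
    count = abs(n)
    if unique:
        picked = rng.sample(items, count)       # no item is picked twice
    else:
        picked = rng.choices(items, k=count)    # items may repeat
    sep = "\n" if n < 0 else delimiter
    return sep.join(picked)


print(rand_str(5, "a", "b", "c"))
print(rand_str(3, "a", "b", "c", unique=True, delimiter=","))
```

As in the sketch, a distinct selection can only succeed when N does not exceed the number of candidate strings.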

 

 

make_batch(text,count[, start1, step1, stop1, fmt1, start2, step2, stop2, fmt2, start3, step3, stop3, fmt3])

 

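=> generates 'count' lines from 'text'; ${batch_no1}, ${batch_no2} and ${batch_no3} in the text are replaced by counters that run from start to stop in increments of step and then restart.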

 

Example: make_batch("test batch id1=${batch_no1} id2=${batch_no2}",10, 2,1,5,"d", 6,2,20,"d")

 

Will generate:

 

test batch id1=2 id2=6

 

test batch id1=3 id2=8

 

test batch id1=4 id2=10

 

test batch id1=5 id2=12

 

test batch id1=2 id2=14

 

test batch id1=3 id2=16

 

test batch id1=4 id2=18

 

test batch id1=5 id2=20

 

test batch id1=2 id2=6

 

test batch id1=3 id2=8

 

 

Format fmt1,fmt2,fmt3 Examples --

 

"d" -- "123"

 

"10d" -- "       123"

 

"-10d" -- "123       "

 

"010d" -- "0000000123"

 

 

There is a "File->Load->Text generator" menu supporting interactive settings for make_batch function.
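The make_batch example above behaves like a set of independent counters that each wrap around after reaching their stop value. A Python sketch of that reading follows. The helper names are hypothetical, steps are assumed to be positive, and the fmt arguments are left out, so every value is printed as a plain integer.

```python
def counter(start, step, stop):
    """Yield start, start+step, ... and restart at start after passing stop."""
    value = start
    while True:
        yield value
        value += step
        if value > stop:
            value = start


def make_batch(text, count, *specs):
    """specs are (start, step, stop) tuples for batch_no1, batch_no2, ..."""
    counters = [counter(start, step, stop) for start, step, stop in specs]
    lines = []
    for _ in range(count):
        line = text
        for i, c in enumerate(counters, start=1):
            # Every counter advances once per generated line.
            line = line.replace("${batch_no%d}" % i, "%d" % next(c))
        lines.append(line)
    return "\n".join(lines)


print(make_batch("test batch id1=${batch_no1} id2=${batch_no2}", 10,
                 (2, 1, 5), (6, 2, 20)))
```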

 

 

make_batch1(text,count, start1, step1, stop1, fmt1, start2, step2, stop2, fmt2, start3, step3, stop3, fmt3)

 

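=> generates 'count' lines from 'text' in the same way as make_batch, except that batch_no1/2/3 are related: as the example below shows, batch_no1 advances only after batch_no2 has completed a full cycle.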

 

 

Example: make_batch1("test batch1 id1=${batch_no1} id2=${batch_no2}",10, 2,1,5,"d", 6,2,20,"d")

 

Will generate:

 

test batch1 id1=2 id2=6

 

test batch1 id1=2 id2=8

 

test batch1 id1=2 id2=10

 

test batch1 id1=2 id2=12

 

test batch1 id1=2 id2=14

 

test batch1 id1=2 id2=16

 

test batch1 id1=2 id2=18

 

test batch1 id1=2 id2=20

 

test batch1 id1=3 id2=6

 

test batch1 id1=3 id2=8

 

 

Format fmt1,fmt2,fmt3 Examples --

 

"d" or "" -- "123"

 

"10d" or "10" -- "       123"

 

"-10d" or "-10" -- "123       "

 

"010d" or "010" -- "0000000123"

 

 

There is a "File->Load->Text generator" menu supporting interactive settings for make_batch1 function.
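In the make_batch1 example the counters behave like nested loops: the last counter changes fastest and an earlier counter advances only when the next one wraps. A Python sketch of that reading, with hypothetical names. It assumes positive steps, ignores the fmt arguments, and assumes the whole sequence restarts once every combination has been used.

```python
from itertools import cycle, islice, product


def make_batch1(text, count, *specs):
    """specs are (start, step, stop) tuples for batch_no1, batch_no2, ..."""
    ranges = [range(start, stop + 1, step) for start, step, stop in specs]
    lines = []
    # product() iterates like nested loops, with the last range innermost.
    for values in islice(cycle(product(*ranges)), count):
        line = text
        for i, v in enumerate(values, start=1):
            line = line.replace("${batch_no%d}" % i, str(v))
        lines.append(line)
    return "\n".join(lines)


print(make_batch1("test batch1 id1=${batch_no1} id2=${batch_no2}", 10,
                  (2, 1, 5), (6, 2, 20)))
```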

 

 

*** Type of "Maths" ***

 

 

calc(expression[,format])

 

=> return the value of the maths expression, example: calc(2*$match)

 

Format Examples --

 

"8.2f" -- "    5.67"

 

"10d" or "10" -- "       123"

 

"010d" or "010" -- "0000000123"

 

"-10d" or "-10" -- "123       "
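The format strings above follow printf conventions without the leading percent sign. The Python sketch below reproduces the four examples with the % operator; the helper name is hypothetical, and it assumes that a bare width such as "10" is shorthand for "10d".

```python
def calc_format(value, fmt="d"):
    """Apply a printf-style format given without the leading '%'."""
    # Assumption: "" means "d" and a trailing digit means the "d" was omitted.
    if fmt == "" or fmt[-1].isdigit():
        fmt += "d"
    return ("%" + fmt) % value


print(repr(calc_format(5.67, "8.2f")))  # '    5.67'
print(repr(calc_format(123, "10")))     # '       123'
print(repr(calc_format(123, "010d")))   # '0000000123'
print(repr(calc_format(123, "-10")))    # '123       '
```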

 

 

count(text,pattern)

 

=> return the number of occurrences of pattern in text, example: count($match,'abc')

 

 

max(n1,n2,n3,...)

 

=> Return the maximum of the numbers n1,n2,n3 ...

 

min(n1,n2,n3,...)

 

=> Return the minimum of the numbers n1,n2,n3 ...

 

add(n1,n2,n3,...)

 

=> Calculate the sum of all values

 

byte2num(bytes,[order:1=little-endian])

 

=> Convert bytes to a number, typically used in "Fast Replace" with binary mode

 

num2byte(number,[order:1=little-endian],[length])

 

=> Convert a number to bytes; length decides the number of bytes returned; typically used in "Fast Replace" with binary mode

 

byte_add(bytes,number,[order:1=little-endian],[length])

 

=> Add the value of number to the binary bytes; length decides the number of bytes returned; typically used in "Fast Replace" with binary mode
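The three byte functions can be modelled in Python with int.from_bytes and int.to_bytes. The sketch below is not the tool's code. It assumes big-endian order when order is not 1, a default length of 4 bytes for num2byte, and that byte_add keeps the input length and wraps around on overflow when length is omitted.

```python
def byte2num(data, order=0):
    """Interpret bytes as an unsigned integer (order=1 means little-endian)."""
    return int.from_bytes(data, "little" if order == 1 else "big")


def num2byte(number, order=0, length=4):
    """Encode an unsigned integer into `length` bytes."""
    return number.to_bytes(length, "little" if order == 1 else "big")


def byte_add(data, number, order=0, length=None):
    """Add `number` to the integer stored in `data` and re-encode it."""
    size = len(data) if length is None else length
    # Assumed behaviour: wrap around instead of failing on overflow.
    total = (byte2num(data, order) + number) % (1 << (8 * size))
    return num2byte(total, order, size)


print(byte2num(b"\x01\x02"))        # 258
print(byte2num(b"\x01\x02", 1))     # 513
print(byte_add(b"\x00\xff", 1))     # b'\x01\x00'
```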

 

 

sys_encode(string,[encoding])

 

=> If the 2nd parameter is absent, return the string encoded with $sys_encoding; otherwise use the encoding type specified by the user

 

sys_decode(string,[encoding])

 

=> If the 2nd parameter is absent, return the string decoded with $sys_encoding; otherwise use the encoding type specified by the user

 

 

md5sum(filename)

 

=>Calculate md5sum of the file

 

sha1sum(filename)

 

=>Calculate sha1sum of the file
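For comparison, the same file checksums can be computed in Python with hashlib. The helper name is hypothetical; the file is read in chunks so that large files do not need to fit in memory.

```python
import hashlib


def file_digest(filename, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file's content ('md5' or 'sha1')."""
    digest = hashlib.new(algorithm)
    with open(filename, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The results should match the output of the md5sum and sha1sum command-line utilities for the same file.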

 

 

*** Type of "Dictionary" ***

 

 

get_value_num()

 

=> get the number of values in the dictionary

 

get_value($key)

 

=> get the value of $key in the dictionary

 

set_value($key,$value)

 

=> set the $key=$value in the dictionary

 

clear_value($key)

 

=> remove the definition of $key in the dictionary

 

clear_values_all()

 

=> remove all terms in the dictionary

 

get_values_all(format)

 

=> return all the key-value pairs in the dictionary in the format "Key<separator>Value"; by default the separator is the Tab key.

 

If format is specified, the output uses the specified format, e.g. '$key, $value'.

 

import_values(file,format)

 

=> imports values into the dictionary from file using the specified format; format is the same as in the dictionary import operation, and its default value is 'key'.

 

 

Note: the functions above are used for dictionary operations; the dictionary data can also be imported, exported and cleared manually. For details, see the 'Usage of Dictionary' part.

 

 

*** Type of "File" ***

 

 

file_time(filename,format,path)

 

=> Get the last modification time of a file; format and path can be omitted.

 

format is the same as formattime function.

 

Example:file_time("a.txt","%Y-%m-%d","c:\\test");

 

 

file_line(filename,line,path,max_bytes_read,flags)

 

=> Extract the specified line from a file; path, max_bytes_read and flags can be omitted. The whole file is read if the first max_bytes_read bytes (default=0, meaning maximum) fail to match; flags can be 'binary', 'text', 'nametext' (default) or '(encode_type_x)'.

 

Example:file_line("a.txt",3,"c:\\test") will get the 3rd line text from c:\test\a.txt;

 

 

file_content(filename,pattern,path,max_bytes_read,flags)

 

=> Extract the 1st pattern memory (the first captured group) of the specified pattern from a file; the whole file content is returned if pattern is omitted;

 

path, max_bytes_read and flags can also be ignored; it will read whole file if the first max_bytes_read (default=0(maximum)) bytes do not match the pattern;

 

flags can be 'binary', 'text' (default), 'nametext' or '(encode_type_x)'; if flags is a single character, non-printable characters are replaced with that character.

 

filename can also be web address like 'http://www.mind-pioneer.com/' or 'http://www.mind-pioneer.com/,html', ',html' means download as html format.

 

filename can also be dos command like !dir

 

Example:file_content("a.txt","begin(.*)end","c:\\test",0,'text') will get the first text between "begin" and "end" from c:\test\a.txt;

 

Example:file_content("http://www.mind-pioneer.com/,html","<title>(.*?)</title>",'','','nametext') will get the title of the webpage http://www.mind-pioneer.com

 

Example:file_content("!dir","",'c:\\temp') will list all files in c:\temp

 

 

mp3_info(filename,format,path)

 

=> Get the mp3 file info, path can be ignored, format can be a string containing %A(album) %R(artist) %T(track), default is '%R-%A', format can also be 'track', 'artist', or 'album'.

 

Example:mp3_info("a.mp3","%R-%A") will get the artist and album of the file a.mp3 with format of artist-album.

 

 

img_info(filename,format,path)

 

=> Return info of image file, path can be ignored, format can be a string containing %W(Width) %H(height), default is '%Wx%H', format can also be 'width', 'height';

 

Example:img_info("a.jpg","%Wx%H") will get the width and height of the file a.jpg in a format like 1024x768

 

 

html_title(filename,path,max_bytes_read)

 

=> Extract html title from file, path and max_bytes_read can be ignored.

 

max_bytes_read is max number of bytes to read in html files to find html title, default value is 1000. User can set it larger if the title is beyond 1000 bytes from start of file.

 

Example:html_title("a.html","c:\\test",2000);

 

 

*** Type of "Time" ***

 

 

current_date()

 

=> return current date with format YYYY-MM-DD

 

current_time()

 

=> return current time with format HH:MM:SS

 

date_time()

 

=> return current date-time with format YYYY-MM-DD HH:MM:SS

 

 

formattime(date_string, new_format, +/-number_of_seconds)

 

=> Will convert the date time to a new format, and optionally shift it by the given number of seconds (for example, to change the time zone or add some days).

 

Example:formattime("21/Jan/2008:15:12:00","%y-%m-%d %H:%M:%S")=>08-01-21 15:12:00

 

Example:formattime("21/Jan/2008:15:12:00","%y-%m-%d %H:%M:%S",5*24*3600)=>08-01-26 15:12:00(add 5 days)
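The two examples above can be reproduced in Python with datetime. This sketch is hypothetical and much narrower than formattime: it parses only the common logfile layout used in the examples (the input_format default is an assumption) rather than detecting the input format automatically, and it uses only strftime codes shared by both systems.

```python
from datetime import datetime, timedelta


def formattime(date_string, new_format, offset_seconds=0,
               input_format="%d/%b/%Y:%H:%M:%S"):
    """Parse, shift by offset_seconds, and print in new_format."""
    moment = datetime.strptime(date_string, input_format)
    moment += timedelta(seconds=offset_seconds)
    return moment.strftime(new_format)


print(formattime("21/Jan/2008:15:12:00", "%y-%m-%d %H:%M:%S"))
# 08-01-21 15:12:00
print(formattime("21/Jan/2008:15:12:00", "%y-%m-%d %H:%M:%S", 5 * 24 * 3600))
# 08-01-26 15:12:00
```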

 

 

Supported date_strings:

 

 

"Wed, 09 Feb 2008 22:23:32 GMT" -- HTTP format

 

"Thu Feb 3 17:03:55 GMT 2008" -- ctime(3) format

 

"Thu Feb 3 00:00:00 2008", -- ANSI C asctime() format

 

"Tuesday, 08-Feb-08 14:15:29 GMT" -- old rfc850 HTTP format

 

"Tuesday, 08-Feb-2008 14:15:29 GMT" -- broken rfc850 HTTP format

 

 

"03/Feb/2008:17:03:55 -0700" -- common logfile format

 

"09 Feb 2008 22:23:32 GMT" -- HTTP format (no weekday)

 

"08-Feb-08 14:15:29 GMT" -- rfc850 format (no weekday)

 

"08-Feb-2008 14:15:29 GMT" -- broken rfc850 format (no weekday)

 

 

"2008-02-03 14:15:29 -0100" -- ISO 8601 format

 

"2008-02-03 14:15:29" -- zone is optional

 

"2008-02-03" -- only date

 

"2008-02-03T14:15:29" -- Use T as separator

 

"20080203T141529Z" -- ISO 8601 compact format

 

"20080203" -- only date

 

 

"08-Feb-08" -- old rfc850 HTTP format (no weekday, no time)

 

"08-Feb-2008" -- broken rfc850 HTTP format (no weekday, no time)

 

"09 Feb 2008" -- proposed new HTTP format (no weekday, no time)

 

"03/Feb/2008" -- common logfile format (no time, no offset)

 

 

"Feb 3 2008" -- Unix 'ls -l' format

 

"Feb 3 17:03" -- Unix 'ls -l' format

 

 

"11-15-08 03:52PM" -- Windows 'dir' format

 

 

Supported new date time format:

 

 

%% PERCENT

 

%a day of the week abbr

 

%A day of the week

 

%b month abbr

 

%B month

 

%c MM/DD/YY HH:MM:SS

 

%C ctime format: Sat Nov 19 21:05:57 1994

 

%d numeric day of the month, with leading zeros (eg 01..31)

 

%e numeric day of the month, without leading zeros (eg 1..31)

 

%D MM/DD/YY

 

%G GPS week number (weeks since January 6, 1980)

 

%h month abbr

 

%H hour, 24 hour clock, leading 0's

 

%I hour, 12 hour clock, leading 0's

 

%j day of the year

 

%k hour

 

%l hour, 12 hour clock

 

%L month number, starting with 1

 

%m month number, starting with 01

 

%M minute, leading 0's

 

%n NEWLINE

 

%o ornate day of month -- "1st", "2nd", "25th", etc.

 

%p AM or PM

 

%P am or pm (Yes %p and %P are backwards :)

 

%q Quarter number, starting with 1

 

%r time format: 09:05:57 PM

 

%R time format: 21:05

 

%s seconds since the Epoch, UTC

 

%S seconds, leading 0's

 

%t TAB

 

%T time format: 21:05:57

 

%U week number, Sunday as first day of week

 

%w day of the week, numerically, Sunday == 0

 

%W week number, Monday as first day of week

 

%x date format: 11/19/94

 

%X time format: 21:05:57

 

%y year (2 digits)

 

%Y year (4 digits)

 

%Z timezone in ascii. eg: PST

 

%z timezone in format -/+0000

 

 

formattime_gm(date_string, new_format, +/-number_of_seconds)

 

=> Will convert the date time to a new format, expressed as GMT date time.

 

Example:formattime_gm("21/Jan/2008:15:12:00","%y-%m-%d %H:%M:%S")=>08-01-21 20:12:00 (when the local time zone is 5 hours behind GMT)

 

 

difftime(date_string1, date_string2)

 

=> Will return the number of seconds between two date times.

 

Example:difftime("21/Jan/2008","1970/1/1");
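The size of the result in this example can be checked with a short Python sketch. It takes datetime objects rather than date strings, and it assumes the result is the first date minus the second; the sign convention of difftime is not stated above.

```python
from datetime import datetime


def difftime(first, second):
    """Seconds from `second` to `first` (assumed order: first minus second)."""
    return int((first - second).total_seconds())


# 21 Jan 2008 is 13899 days after 1 Jan 1970.
print(difftime(datetime(2008, 1, 21), datetime(1970, 1, 1)))  # 1200873600
```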

 

 

*** Type of "Conversion" ***

 

 

html_to_txt($html_strings)

 

=> input a html_strings, return a text string without html tags

 

 

html_to_link($html_strings[,$base,$showtype,$typefilter])

 

=> input a html_strings, return all the links in the html_string

 

Note: the relative links will be converted to absolute by adding '$base'.

 

The link type is shown if showtype is 'Y' or 'y'; typefilter can be set to 'A', 'IMG', 'A,IMG', etc.

 

 

correct(text)

 

=> Will return the correct value of 'text' if 'text' is mistyped.

 

Example: correct("repalce") will return "replace"

 

 

ip2country(ipaddress)

 

=> Will convert the ipaddress to corresponding country name

 

Example: ip2country("55.66.77.88") will return "United States"

 

 

ip2countryid(ipaddress)

 

=> Will convert the ipaddress to the corresponding country ID (country code)

 

Example: ip2countryid("55.66.77.88") will return "US"

 

 

*** Type of "System" (all Perl Functions, Libs) ***

 

 

uc(text)

 

=>Change text to upper case

 

lc(text)

 

=>Change text to lower case

 

ucfirst(text)

 

=>Change the first character of text to upper case

 

lcfirst(text)

 

=>Change the first character of text to lower case

 

index(string,substring,position)

 

=>Returns the position of the first occurrence of substring in string at or after position.

 

If you don't specify position, the search starts at the beginning of string.

 

rindex(string,substring,position)

 

=>Returns the position of the last occurrence of substring in string at or before position.

 

If you don't specify position, the search starts at the end of string.

 

substr(string,offset,length)

 

=>Returns a portion of string as determined by the offset(0 from leftmost,-1 from rightmost) and length parameters

 

chr(number)

 

=>Returns the character represented by number in the ASCII table. For instance, chr(65)=A

 

ord(string)

 

=>Returns the ascii value of the first character of string. For instance, ord("ABC")=65

 

length(string)

 

=>Returns the length of string

 

abs(value)

 

=>Returns absolute value.

 

int(value)

 

=>Convert the value to an integer and truncate it if necessary.

 

rand(number)

 

=>Return a random real number between 0 and the number. If no value is passed, 1 is assumed and a value between 0 and 1 is returned.

 

sqrt(value)

 

=>Returns the square root of the value.

 

time()

 

=>Returns the number of seconds since January 1, 1970.

 

pack(format,data1,data2,...)

 

=>Returns packed bytes of data1,data2,... with specified format.

 

unpack(format,data)

 

=>Returns unpacked values from data with specified format.

 

join(delimiter,str1,str2...)

 

=>Returns a string formed by joining str1, str2, ... , separated by "delimiter".

 

split(delimiter,string)

 

=>Returns an array of strings split from "string" at each "delimiter".

 

md5_hex(string)

 

=>Calculate md5sum of the string

 

sha1_hex(string)

 

=>Calculate sha1sum of the string

 

encode(encode,string)

 

=>Returns the string encoded with "encode"

 

decode(encode,string)

 

=>Returns the string decoded with "encode"

 

 

Examples:

 

length($match) will return the number of characters of $match

 

uc($match) will return upper case of string $match

 

lc($match) will return lower case of string $match

 

ucfirst("replace") will return "Replace"

 

chr(ord('a')+rand(26)) will return random character of 'a' to 'z'

 

chr(ord('0')+rand(10)) will return random digit of '0' to '9'

 

...
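The random-character idiom relies on Perl's rand(n) returning a real number that is at least 0 and strictly less than n, which chr() then truncates to an integer. This means the argument must be the number of candidate characters (26 for 'a' to 'z', 10 for '0' to '9'). A Python sketch of the same arithmetic, with a hypothetical helper name:

```python
import random


def random_char(first, count, rng=random):
    """Return one of the `count` characters starting at `first`."""
    # rng.random() is in [0, 1), so the offset is an integer in 0..count-1.
    return chr(ord(first) + int(rng.random() * count))


print(random_char("a", 26))  # a letter from 'a' to 'z'
print(random_char("0", 10))  # a digit from '0' to '9'
```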

 

 

 

Please visit http://perldoc.perl.org/perlfunc.html to learn more about Perl functions.