Head-scratching regex problem for a newbie
Posted: Sat Aug 23, 2008 12:39 pm
Hi everyone,
I am a complete newbie with regular expressions. I have gone through a lot of regex tutorials but still can't figure out how to solve this problem. Maybe the regex gurus can help.
What I am trying to do: from the text below, I first want to find the lines that follow a particular pattern. The expression "\\S+([ \\t]+-?[0-9.]+){8}" gives me all the lines I am looking for. Among these lines, I want to check whether two lines start with the same word. If such a pair is found, I want to add the open, high, low and close values of the 2nd line to those of the 1st line, and then remove the 2nd line from the text. I hope it doesn't sound too complicated. Is it possible? Or is it too difficult and too much for regex to handle?
E.g. "\\S+([ \\t]+-?[0-9.]+){8}" matches all the stock lines from the "A Group" section through the Spot Transactions section. The stock "LEGACYFOOT" appears on two lines (once in Z Group and once in Spot Transactions). I want to add the open, high, low and close of LEGACYFOOT in Spot Transactions to the "LEGACYFOOT" line in Z Group, and then delete the 2nd "LEGACYFOOT" line from the text.
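For illustration, here is a minimal sketch of the steps described above, written in Python. A regex on its own can find the lines but cannot do the arithmetic, so the pattern is combined with a small loop. Several things here are assumptions rather than anything stated in the post: "add" is taken to mean numerically summing the first four columns (Open, High, Low, Close), the other columns of the first line are left unchanged, the pattern is applied to each whole line, and the function name `merge_duplicates` is made up for this example.

```python
import re
from decimal import Decimal

# Same pattern as in the post, with the first word captured separately.
# It is applied to the whole (stripped) line via fullmatch.
ROW = re.compile(r"(\S+)((?:[ \t]+-?[0-9.]+){8})")

def merge_duplicates(text):
    """Fold repeated instrument rows into the first row with that name.

    Assumption: "add" means summing Open, High, Low and Close
    (the first four numeric fields); the remaining fields are kept
    from the first row.
    """
    first_seen = {}  # instrument name -> index of its first row in `out`
    out = []
    for line in text.splitlines():
        m = ROW.fullmatch(line.strip())
        if not m:
            out.append(line)          # headers, totals, separators, etc.
            continue
        name, fields = m.group(1), m.group(2).split()
        if name not in first_seen:
            first_seen[name] = len(out)
            out.append(line)
            continue
        # Duplicate row: add its four price fields to the first row ...
        idx = first_seen[name]
        kept = ROW.fullmatch(out[idx].strip()).group(2).split()
        for i in range(4):
            kept[i] = str(Decimal(kept[i]) + Decimal(fields[i]))
        out[idx] = " ".join([name] + kept)
        # ... and drop the duplicate by not appending it.
    return "\n".join(out)

sample = """Z Group
LEGACYFOOT 14.80 17.00 14.80 16.50 10.73 77 64000 10.261
LEXCO 122.00 124.00 122.00 122.50 4.25 2 70 .086
PRICES IN SPOT TRANSACTIONS : 2008-08-21
LEGACYFOOT 14.80 16.80 16.00 16.50 10.73 9 9000 1.461
PUBALIBANK 859.00 872.75 853.00 857.00 1.48 1216 38105 328.444"""

print(merge_duplicates(sample))
```

With this sample, the first LEGACYFOOT line becomes `LEGACYFOOT 29.60 33.80 30.80 33.00 10.73 77 64000 10.261` and the second one is gone. `Decimal` is used instead of `float` so that values such as 14.80 keep their two decimal places after the addition.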
thanks
omit
The text (edited to simplify):
DHAKA STOCK EXCHANGE LTD.
TODAY'S SHARE MARKET : 2008-08-21
=================================
(If the page is not updated please press the refresh button)
EQUITY : 745081109873.65
DEBT SECURITIES : 202154936500.00
TOTAL : 947236046373.65
PRICES IN PUBLIC TRANSACTIONS : 2008-08-21
==========================================
A Group
-------
Instr Code Open High Low Close %Chg Trade Volume Value(Lc)
1STBSRS 705.00 710.00 686.00 691.25 -.18 85 5650 39.365
1STICB 5200.00 5250.00 5200.00 5224.75 4.22 6 40 2.090
2NDICB 1650.00 1650.00 1561.00 1583.00 -.07 9 75 1.187
3RDICB 1020.25 1036.00 1020.25 1029.50 -.50 6 85 .875
4THICB 1006.25 1050.00 1006.25 1035.00 1.42 11 160 1.656
MIRACLEIND 26.20 27.00 26.10 26.80 3.87 64 60000 15.965
MITHUNKNIT 184.50 185.00 176.00 180.75 .97 21 960 1.739
QSMDRYCELL 37.50 38.50 37.30 38.00 3.26 191 150500 57.158
RAHIMTEXT 390.00 420.00 390.00 410.00 5.12 2 30 .123
RANFOUNDRY 59.50 62.00 58.90 61.50 5.12 126 81000 49.217
UTTARABANK 2849.00 2956.50 2848.00 2900.25 2.94 2892 48130 1404.152
UTTARAFIN 766.00 825.00 766.00 819.75 5.06 179 15350 124.758
----- -------- ---------
----- -------- ---------
55122 12922983 22780.243
"A Group" Scrips traded in Public Market = 146
B Group
-------
Instr Code Open High Low Close %Chg Trade Volume Value(Lc)
AGRANINS 213.00 239.00 213.00 226.50 8.76 179 17550 39.805
BDAUTOCA 157.00 159.75 153.00 156.00 -.63 23 875 1.366
NITOLINS 332.25 357.00 332.25 340.00 2.10 67 6250 21.441
SONARBAINS 145.00 153.00 143.75 150.50 6.54 99 11800 17.340
----- -------- ---------
----- -------- ---------
741 223380 154.313
"B Group" Scrips traded in Public Market = 12
G Group
-------
"G Group" Scrips traded in Public Market = 0
N Group
-------
Instr Code Open High Low Close %Chg Trade Volume Value(Lc)
CONTININS 228.00 240.00 215.00 231.75 8.29 153 11450 26.258
DBH 1180.00 1249.00 1155.00 1224.75 6.63 94 5150 61.723
MPETROLEUM 131.50 133.00 129.90 130.60 2.03 496 96800 126.857
TITASGAS 354.50 357.75 344.00 350.75 .64 2126 379400 1331.344
----- -------- ---------
----- -------- ---------
3979 742515 1729.643
"N Group" Scrips traded in Public Market = 8
Z Group
-------
Instr Code Open High Low Close %Chg Trade Volume Value(Lc)
ALLTEX 68.75 73.00 68.50 71.75 4.36 18 1500 1.078
ANLIMAYARN 50.25 50.25 50.00 50.00 3.62 2 150 .075
LAFSURCEML 568.00 582.00 567.00 577.50 1.27 206 18550 107.275
LEGACYFOOT 14.80 17.00 14.80 16.50 10.73 77 64000 10.261
LEXCO 122.00 124.00 122.00 122.50 4.25 2 70 .086
SHYAMPSUG 10.90 10.90 10.90 10.90 3.80 6 700 .076
SOCIALINV 365.50 375.00 365.00 371.00 2.77 584 52100 193.397
WATACHEM 305.25 312.25 305.25 311.25 4.01 6 180 .560
WONDERTOYS 60.75 62.50 59.25 61.50 2.50 21 2700 1.662
ZEALBANGLA 14.50 14.90 14.50 14.60 .68 7 3900 .570
----- -------- ---------
----- -------- ---------
2888 467200 962.587
"Z Group" Scrips traded in Public Market = 60
===========================
62730 14356078 25626.792
Total number of scrips traded in Public Market = 226
PRICES IN SPOT TRANSACTIONS : 2008-08-21
==========================================
Instr Code Open High Low Close %Chg Trade Volume Value(Lc)
LEGACYFOOT 14.80 16.80 16.00 16.50 10.73 9 9000 1.461
PUBALIBANK 859.00 872.75 853.00 857.00 1.48 1216 38105 328.444
----- -------- ---------
----- -------- ---------
1225 47105 329.904
Total number of scrips traded in Spot Market = 2
PRICES IN SPOT TRANSACTIONS (BONDs) : 2008-08-21
==================================================
Total number of BONDs traded in Spot Market = 0
PRICES IN ODDLOT TRANSACTIONS : 2008-08-21
============================================
Instr Code Max Price Min Price Trades Quantity Value(In lakhs)
ABBANK 909.00 902.00 2 4 .036
ACI 475.00 475.00 2 30 .143
AGNISYSL 67.00 60.10 4 540 .345
ALARABANK 465.00 395.00 19 354 1.534
APEXADELFT 2600.00 2600.00 3 30 .780
UTTARABANK 2950.00 2950.00 1 1 .030
UTTARAFIN 800.00 800.00 3 62 .496
------ -------- ------------
------ -------- ------------
438 12122 27.815
Total number of scrips traded in Oddlot = 75
PRICES IN BLOCK TRANSACTIONS : 2008-08-21
===========================================
Total number of scrips traded in Block = 0