Go4Expert

Go4Expert (http://www.go4expert.com/)
-   Perl (http://www.go4expert.com/forums/perl/)
-   -   Comparing two data files & Writing an Outfile (http://www.go4expert.com/forums/comparing-data-files-writing-outfile-t26049/)

filter 13Jun2011 22:02

Comparing two data files & Writing an Outfile
 
Hi Everyone,

I have two files i.e. one file2 contains today's data and the other file1 contains Yesterday's data.

The data in the files contains 226 columns and the data for the coulums separated by a Pipe "|" delimiter.

Now, I have 4 Primary keys (coulumns) by which I have to compare file2 and file1 and generate a diff file that is the data that is not present in file2 but present in file1 and vice versa.

The sample of the data that contains in file1 is ( 1 record):

Code:

START-OF-FILE
FILENAME=fixedincome_bo_namr.dif
DATA=bo
REGION=namr
TYPE=dif
PROGRAMNAME=getdata
DATEFORMAT=yyyymmdd

START-OF-FIELDS
# Security description
TICKER
CPN
MATURITY
SERIES
NAME
SHORT_NAME
ISSUER_INDUSTRY
MARKET_SECTOR_DES
CPN_FREQ
CPN_TYP
MTY_TYP
CALC_TYP_DES
DAY_CNT
MARKET_ISSUE
COUNTRY
CRNCY
COLLAT_TYP
AMT_ISSUED
AMT_OUTSTANDING
MIN_PIECE
MIN_INCREMENT
PAR_AMT
LEAD_MGR
EXCH_CODE
REDEMP_VAL

# Issuance information
ANNOUNCE_DT
FIRST_SETTLE_DT
FIRST_CPN_DT
INT_ACC_DT
ISSUE_DT
ISSUE_PX

# Identifiers
ID_EUROCLEAR
ID_XTRAKTER
ID_SEDOL1
ID_SEDOL2
ID_CEDEL
ID_WERTPAPIER
ID_ISIN
ID_DUTCH
ID_VALOREN
ID_FRENCH
ID_COMMON
ID_JAPAN
ID_BELGIUM
ID_DANISH
ID_AUSTRIAN
ID_LUXEMBOURG
ID_SWEDISH
ID_NORWAY
ID_JAPAN_COMPANY
ID_SPAIN
ID_ITALY
ID_BB_COMPANY
ID_BB_SECURITY
ID_CUSIP

# Schedules
NXT_CALL_DT
NXT_CALL_PX
NXT_PAR_CALL_DT
NXT_PUT_DT
NXT_PUT_PX
NXT_PAR_PUT_DT
NXT_CPN_DT
NXT_SINK_DT
NXT_REFUND_DT

# Ratings
RTG_JCR

# Floaters
REFIX_FREQ
NXT_REFIX_DT
RESET_IDX

# Inflation-indexed bonds

# Preferreds
PFD_DVD_PAY_DT
PFD_EX_DVD_DT
PFD_RST_DVD

# Convertibles
CV_COMMON_TICKER
CV_COMMON_TICKER_EXCH
CV_CNVS_RATIO
CV_UNTIL
CV_CNVS_FEXCH_RT
CV_PROV_PX

CALLABLE
SINKABLE
PUTABLE

ID_BB_PARENT_CO
PARENT_COMP_NAME
PARENT_COMP_TICKER
CNTRY_OF_INCORPORATION
BASIC_SPREAD
INDUSTRY_SECTOR
INDUSTRY_GROUP
INDUSTRY_SUBGROUP
COUNTRY_GUARANTOR
CNTRY_OF_DOMICILE
SECURITY_DES
144A_FLAG
FLT_BENCH_MULTIPLIER
SECURITY_TYP
CV_START_DT
GUARANTOR_NAME
PREV_CPN_DT
NXT_SINK_AMT
CALL_DISCRETE
PUT_DISCRETE
MAKE_WHOLE_CALL
ID_BB_UNIQUE
LONG_COMP_NAME
REDEMP_CRNCY
CPN_CRNCY
DTC_ELIGIBLE
STRUCTURED_NOTE
PCT_PAR_QUOTED
PCS_QUOTE_TYP
IS_UNIT_TRADED
IS_REVERSE_CONVERTIBLE
TRADE_CRNCY
BEARER
REGISTERED
CALLED
CALLED_DT
ISSUER
CALL_FEATURE
PUT_FEATURE
PENULTIMATE_CPN_DT
FLT_CPN_CONVENTION
CUR_CPN
FLOATER
TRADE_STATUS
CDR_COUNTRY_CODE
CDR_SETTLE_CODE
SEASONING_STATUS
FINAL_MATURITY
PRVT_PLACE
CALC_TYP
REMOVAL_REASON
IS_PERPETUAL
IS_REG_S
CALLED_PX
DEFAULTED
GILTS_EX_DVD_DT
MOST_RECENT_REPORTED_FACTOR
NXT_FACTOR_DT
OID_BOND
DELIVERY_TYP
ID_SEDOL3
ID_SEDOL4
ID_SEDOL5
SEDOL1_COUNTRY_ISO
SEDOL2_COUNTRY_ISO
SEDOL3_COUNTRY_ISO
SEDOL4_COUNTRY_ISO
SEDOL5_COUNTRY_ISO
ID_MIC1
ID_MIC2
ID_MIC3
ID_MIC4
ID_MIC5
CV_SH_PAR
DUAL_CRNCY
EXTENDIBLE
EXCHANGEABLE
IS_SOFT_CALL
CV_MANDATORY_CNVS
EU_SAVINGS_DIRECTIVE
ID_CUSIP_REAL
INDUSTRY_SUBGROUP_NUM
SECURITY_TYP2
LAST_REFIX_DT
ISO_COUNTRY_GUARANTOR
DTC_REGISTERED
CALL_PARTIAL
CV_CNVS_PX
IS_CURRENT_GOVT
FIRST_CALL_DT_ISSUANCE
UNDL_ID_BB_UNIQUE
EST_CPN_FLAG
ID_BB_GUARANTOR
IS_DAY_PAYER
STEPUP_CPN
STEPUP_DT
CALC_MATURITY
ID_EXCH_SYMBOL
CREDIT_ENHANCEMENTS
INSURANCE_STATUS
JUNIOR
SENIOR
FLT_PAY_DAY
FLT_DAYS_PRIOR
INFLATION_LINKED_INDICATOR
DAYS_TO_SETTLE
TYPE_OF_BOND
REFERENCE_INDEX
BASE_CPI
CFI_CODE
CPN_FREQ_YLD_CNV
DAY_PAYER_FREQ
EX_DIV_DAYS
EX_DIV_CALENDAR
CONTINGENT_CONVERSION
CONTRIB_DATA_INDICATOR
SECURITY_FACTORABLE
ID_BB_GLOBAL
END-OF-FIELDS

TIMESTARTED=Wed Jun  8 18:33:51 EDT 2011
  230 START-OF-DATA
  231 CP5101987 Corp|-1|198|C|7.300000|20110607|EAB|EUROPEAN AMERICAN BANK|LNB-CALL12/97|BANK|Corp|2|STEP CPN|CALLABLE|MULTI-STEP BOND|1|US DOMESTIC|US|USD|DE      POSIT NOTES|760000.00|.00|10000.0000|1000.0000|1000.00|LAS-sole|NOT LISTED|100.00000|19960607|19960607|19961207|19960607|19960607|100.000000| | | | | |      |US29874AZZ55| | | | | | | | | | | | | | |225433|500231|29874AZZ5| | | | | | | | | |N.A.| | | | | | | | | | | | |Y|N|N|197879|Citibank NA|8156Z|US| |Fin      ancial|Banks|Money Center Banks|N.A.|US|C 7.3 06/07/11|N| |US DOMESTIC| |N.A.| | |Y| |N|COCP5101987|European American Bank|USD|USD|Y|Y|Y|1|N|N|USD|N|N|Y      |19971207|EUROPEAN AMERICAN BANK|Semi-Annual| |19970607| | |N|N|US|US|Does Not Apply|20110607|N|421|MATURED|N|N|100.000000|N| |.000000000| |N|DTC| | | |      N.A.|N.A.|N.A.|N.A.|N.A.| | | | | | |N|N|N|N| |Grandfathered|29874AZZ5|145| | |N.A.|N|FULL (ONLY)| |N|19970607| | | |N| | |20110607| | |N|N|N| | |N|3| |      | |N.A.|2| | | | |N|N|BBG00048KQJ7|

The sample of the data that contains in file2 (Todays file) is ( 1 record):

Code:

START-OF-FILE
FILENAME=fixedincome_bo_namr.dif
DATA=bo
REGION=namr
TYPE=dif
PROGRAMNAME=getdata
DATEFORMAT=yyyymmdd

START-OF-FIELDS
# Security description
TICKER
CPN
MATURITY
SERIES
NAME
SHORT_NAME
ISSUER_INDUSTRY
MARKET_SECTOR_DES
CPN_FREQ
CPN_TYP
MTY_TYP
CALC_TYP_DES
DAY_CNT
MARKET_ISSUE
COUNTRY
CRNCY
COLLAT_TYP
AMT_ISSUED
AMT_OUTSTANDING
MIN_PIECE
MIN_INCREMENT
PAR_AMT
LEAD_MGR
EXCH_CODE
REDEMP_VAL

# Issuance information
ANNOUNCE_DT
FIRST_SETTLE_DT
FIRST_CPN_DT
INT_ACC_DT
ISSUE_DT
ISSUE_PX

# Identifiers
ID_EUROCLEAR
ID_XTRAKTER
ID_SEDOL1
ID_SEDOL2
ID_CEDEL
ID_WERTPAPIER
ID_ISIN
ID_DUTCH
ID_VALOREN
ID_FRENCH
ID_COMMON
ID_JAPAN
ID_BELGIUM
ID_DANISH
ID_AUSTRIAN
ID_LUXEMBOURG
ID_SWEDISH
ID_NORWAY
ID_JAPAN_COMPANY
ID_SPAIN
ID_ITALY
ID_BB_COMPANY
ID_BB_SECURITY
ID_CUSIP

# Schedules
NXT_CALL_DT
NXT_CALL_PX
NXT_PAR_CALL_DT
NXT_PUT_DT
NXT_PUT_PX
NXT_PAR_PUT_DT
NXT_CPN_DT
NXT_SINK_DT
NXT_REFUND_DT

# Ratings
RTG_JCR

# Floaters
REFIX_FREQ
NXT_REFIX_DT
RESET_IDX

# Inflation-indexed bonds

# Preferreds
PFD_DVD_PAY_DT
PFD_EX_DVD_DT
PFD_RST_DVD

# Convertibles
CV_COMMON_TICKER
CV_COMMON_TICKER_EXCH
CV_CNVS_RATIO
CV_UNTIL
CV_CNVS_FEXCH_RT
CV_PROV_PX

CALLABLE
SINKABLE
PUTABLE

ID_BB_PARENT_CO
PARENT_COMP_NAME
PARENT_COMP_TICKER
CNTRY_OF_INCORPORATION
BASIC_SPREAD
INDUSTRY_SECTOR
INDUSTRY_GROUP
INDUSTRY_SUBGROUP
COUNTRY_GUARANTOR
CNTRY_OF_DOMICILE
SECURITY_DES
144A_FLAG
FLT_BENCH_MULTIPLIER
SECURITY_TYP
CV_START_DT
GUARANTOR_NAME
PREV_CPN_DT
NXT_SINK_AMT
CALL_DISCRETE
PUT_DISCRETE
MAKE_WHOLE_CALL
ID_BB_UNIQUE
LONG_COMP_NAME
REDEMP_CRNCY
CPN_CRNCY
DTC_ELIGIBLE
STRUCTURED_NOTE
PCT_PAR_QUOTED
PCS_QUOTE_TYP
IS_UNIT_TRADED
IS_REVERSE_CONVERTIBLE
TRADE_CRNCY
BEARER
REGISTERED
CALLED
CALLED_DT
ISSUER
CALL_FEATURE
PUT_FEATURE
PENULTIMATE_CPN_DT
FLT_CPN_CONVENTION
CUR_CPN
FLOATER
TRADE_STATUS
CDR_COUNTRY_CODE
CDR_SETTLE_CODE
SEASONING_STATUS
FINAL_MATURITY
PRVT_PLACE
CALC_TYP
REMOVAL_REASON
IS_PERPETUAL
IS_REG_S
CALLED_PX
DEFAULTED
GILTS_EX_DVD_DT
MOST_RECENT_REPORTED_FACTOR
NXT_FACTOR_DT
OID_BOND
DELIVERY_TYP
ID_SEDOL3
ID_SEDOL4
ID_SEDOL5
SEDOL1_COUNTRY_ISO
SEDOL2_COUNTRY_ISO
SEDOL3_COUNTRY_ISO
SEDOL4_COUNTRY_ISO
SEDOL5_COUNTRY_ISO
ID_MIC1
ID_MIC2
ID_MIC3
ID_MIC4
ID_MIC5
CV_SH_PAR
DUAL_CRNCY
EXTENDIBLE
EXCHANGEABLE
IS_SOFT_CALL
CV_MANDATORY_CNVS
EU_SAVINGS_DIRECTIVE
ID_CUSIP_REAL
INDUSTRY_SUBGROUP_NUM
SECURITY_TYP2
LAST_REFIX_DT
ISO_COUNTRY_GUARANTOR
DTC_REGISTERED
CALL_PARTIAL
CV_CNVS_PX
IS_CURRENT_GOVT
FIRST_CALL_DT_ISSUANCE
UNDL_ID_BB_UNIQUE
EST_CPN_FLAG
ID_BB_GUARANTOR
IS_DAY_PAYER
STEPUP_CPN
STEPUP_DT
CALC_MATURITY
ID_EXCH_SYMBOL
CREDIT_ENHANCEMENTS
INSURANCE_STATUS
JUNIOR
SENIOR
FLT_PAY_DAY
FLT_DAYS_PRIOR
INFLATION_LINKED_INDICATOR
DAYS_TO_SETTLE
TYPE_OF_BOND
REFERENCE_INDEX
BASE_CPI
CFI_CODE
CPN_FREQ_YLD_CNV
DAY_PAYER_FREQ
EX_DIV_DAYS
EX_DIV_CALENDAR
CONTINGENT_CONVERSION
CONTRIB_DATA_INDICATOR
SECURITY_FACTORABLE
ID_BB_GLOBAL
END-OF-FIELDS

 TIMESTARTED=Thu Jun  9 18:34:19 EDT 2011
 230 START-OF-DATA
 231 9999X01M9 Govt|-1|198|XIB|0|20110707| |WI TSY BILL|WI TSY BILL|USGN|Govt| |NONE|NORMAL|DISCOUNT|2|US GOVT|US|USD| |31782000000|31782000000|100|100| | | |    100.00000| | |20110106| |20110609|0.005000| | | | | | | | | | | | | | | | | | | | | |349057|13714872|9127952X8| | | | | | | | | | | | | | | | | | | | | |      |N|N|N|218252|United States of America|3352Z|US| |Government|Sovereign|Sovereign| |US|XIB 07/07/11| | |US GOVERNMENT| | | | | | |N.A.|GV9999X01M9|United      States Treasury Bill - WI Post Auction|USD|USD| | |Y|2|N| |USD|N.A.|N.A.|N| |WI TSY BILL| | | | | |N|Y|US|US| |20110707|N.A.|5| | | | | | | | |Y| | | |      |N.A.|N.A.|N.A.|N.A.|N.A.| | | | | | | | |N.A.|N| |Non-Grandfathered|9999X01M9|459|Bill| | | | | |N| | | | |N| | |20110707| | | | | | | |N|1| | | | | | |      | | | |N|BBG001CSH9Y7|


Since I have 4 primary keys, I though of doing the code using Hashes.

Could someone please help me out. I am looking for a desperate help on this.

Really appreciated if someone could help out with some ideas.


All times are GMT +5.5. The time now is 06:05.