I had a requirement to add a new column to a table and update it with a default value. The table has 65 million rows, which makes it too large to add and populate the column in a single operation. After trying multiple methods, I settled on the approach below for my situation.
Step 1) First, add the column without a default value.
ALTER TABLE table_name ADD (new_column_name VARCHAR2(256));
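As a quick sanity check (a sketch, assuming the placeholder names above), you can confirm the column exists and carries no default by querying the standard data dictionary view ALL_TAB_COLUMNS:
SELECT column_name, data_type, data_default
FROM all_tab_columns
WHERE owner = 'SCHEMA_NAME'
AND table_name = 'TABLE_NAME'
AND column_name = 'NEW_COLUMN_NAME';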
Step 2) Now we need to update the new column with the default value 9999. We cannot run a single direct UPDATE statement over the whole table: it would fail with rollback segment errors and the classic ORA-01555 "snapshot too old" error.
So I decided to update a limited number of rows at a time and commit after each batch until the whole table is done.
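For reference, the single statement being avoided would look like this (placeholder names as in Step 1); on a 65-million-row table it creates one enormous transaction:
UPDATE schema_name.table_name SET new_column_name = '9999';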
To split all rows into equal-sized buckets (chunks), I used the NTILE analytic function as shown below.
> cat split_rowid.sql
spool /xxxxxx/XXXXXX/XXXXX/XXXXXXntile_bucket.list
set echo off feedback off trimspool on linesize 200 pages 0 heading off
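-- NTILE(256) assigns each rowid to one of 256 equal-sized buckets, ordered by rowid;
-- the outer query then reports each bucket's min/max rowid and row count.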
select min(ROW_ID)||'|'||max(ROW_ID)||'|'||count(*)||'|'||grp_nbr
from
(
select ROW_ID, ntile(256) over (order by ROW_ID) grp_nbr
from (select rowid row_id from SCHEMA_NAME.TABLE_NAME )
)
group by grp_nbr order by grp_nbr;
spool off;
Output :-
------------
Each line of the spooled ntile_bucket.list holds four pipe-separated fields: the bucket's minimum ROWID, its maximum ROWID, its row count, and the bucket number (grp_nbr).
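The real output will contain your table's actual rowids; illustratively (values invented, and 65 million rows split 256 ways gives roughly 254,000 rows per bucket), the lines look like:
AAAWK6AAFAAAACHAAA|AAAWK6AAFAABPyzABc|253906|1
AAAWK6AAFAABPyzABd|AAAWK6AAFAACkXNAAt|253907|2
...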
Step 3) Now we create a shell script that updates the rows between each pair of rowids (the starting and ending rowid of a bucket) and commits after each chunk completes. Using parallelism, the script processes several chunks at a time until all chunks are finished.
-------------------------------
Below is the shell Script :-
-------------------------------
cat run_load_in_parallel.ksh
#!/bin/ksh
#########################################################################
#
# File : run_load_in_parallel.ksh
# Purpose : To update data in parallel
# Parameters :
# example: run_load_in_parallel.ksh 32 (execute 32 parallel updates)
#
#
#########################################################################
. /home/oracle/.bash_profile
export ORACLE_SID=DWEBAPPS1
export ORAENV_ASK=NO
. /usr/local/bin/oraenv
FILE_DIR=/xxxxxx/XXXXXX
BUCKETFILE=${FILE_DIR}/ntile_bucket.list
LOGFILE=${FILE_DIR}/run_load_in_parallel_`date +%m%d%y_%H%M%S`.log
THR=$1
test ${THR}||export THR=32
cd ${FILE_DIR}
#sqlplus -s / as sysdba <<EOF > ${BUCKETFILE}
#set echo off feedback off trimspool on linesize 200 pages 0 heading off
#select min(ROW_ID)||'|'||max(ROW_ID)||'|'||count(*)||'|'||grp_nbr
#from
#(
#select ROW_ID, ntile(256) over (order by ROW_ID) grp_nbr
#from (select rowid row_id from schema_name.table_name )
#)
#group by grp_nbr order by grp_nbr;
#EOF
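# Loop over the bucket list; throttle so that no more than $THR
# sqlplus sessions (identified by the sunil_data script name) run at once.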
for LINE in `cat $BUCKETFILE`
do
while true
do
cnt=`ps -ef |grep sqlplus |grep -v grep |grep sunil_data|wc -l`
if [ $cnt -ge $THR ] ; then
echo "`date` : Not executing since $cnt updates are running, so sleeping for 10 seconds" >> $LOGFILE
sleep 10
else
MINROWID="`echo $LINE|cut -d'|' -f1`"
MAXROWID="`echo $LINE|cut -d'|' -f2`"
GRPNBR="`echo $LINE|cut -d'|' -f4`"
echo "`date` : Starting update of ${GRPNBR}" >> $LOGFILE
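# Build a one-off SQL script for this bucket: update the rowid range, commit, spool a log.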
echo "set echo on feedback on time on timing on " > sunil_data_${GRPNBR}.sql
echo "spool sunil_data_${GRPNBR}.log " >> sunil_data_${GRPNBR}.sql
echo "UPDATE SCHEMA_NAME.TABLE_NAME SET NEWLY_ADDED_COLUMN = '99999' WHERE rowid between '${MINROWID}' and '${MAXROWID}';" >> sunil_data_${GRPNBR}.sql
echo "commit;" >> sunil_data_${GRPNBR}.sql
echo "spool off" >> sunil_data_${GRPNBR}.sql
echo password |sqlplus -s skumar@tnsname @sunil_data_${GRPNBR}.sql &
sleep 1
break
fi
done
done
wait
echo "`date` : All updates completed " >> $LOGFILE
Step 4) To run the script:
nohup ./run_load_in_parallel.ksh 32 &
Here 32 is the degree of parallelism. The script starts the first 32 chunks, then picks up the remaining chunks one by one as sessions finish, until all chunks are processed.
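Once the script logs completion, a simple sanity check (a sketch using the placeholder names from Step 1) is to count any rows the updates missed; the result should be 0:
SELECT COUNT(*)
FROM schema_name.table_name
WHERE new_column_name IS NULL;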
The rowid-range update method is also efficient for large tables that contain LOB columns. You can use the same method to copy data from one table to another.
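As a sketch of that copy use case (table names are hypothetical), each bucket's rowid range from ntile_bucket.list can drive an INSERT ... SELECT instead of an UPDATE:
INSERT INTO schema_name.target_table
SELECT * FROM schema_name.source_table
WHERE rowid BETWEEN 'MINROWID' AND 'MAXROWID';
COMMIT;
Here MINROWID and MAXROWID stand for the values the shell script substitutes per bucket, exactly as in the UPDATE above.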