Help - Search - Members - Calendar
Full Version: getting random results creating table
Oracle DBA Forums > Oracle > Oracle Forum
jwolgast
Ok, in a nutshell I am trying to create a csv file with summarized data. We have a huge table with claim details that is constantly being updated. I am pulling a subset of records that match my criteria into a tempory table (not technically an Oracle temporary table, a regular table that will only exist until I drop it when I am done). This table has multiple entries per claim with different effective dates, paid dates and amounts paid. The result file needs to have one entry per claim with the oldest effective and paid dates and the total of all of the amounts paid on that claim.

Originally I was doing create table new_table as select claim_nbr,other data...,min(ymdeff),min(ymdpaid),sum(amtpay) from my_table group by claim_nbr,other data...

If I ran a select sum(amtpay) from my_table and select sum(amtpay) from new_table I was not getting the same results... If I ran select count (*) from (select distinct claim_nbr from my_table) and select count (*) from (select distinct claim_nbr from new_table) or select count (*) from new_table I was getting the same number of rows. So I wasn't completely losing claims from one table to the next, just some of the details. So, I tried running this:

select * from
(select claim_nbr,sum(amtpay) paysum from my_table
group by claim_nbr
order by claim_nbr) m,
(select claim_nbr,sum(amtpay) paysum from new_table
group by claim_nbr
order by claim_nbr) n
where
m.claim_nbr = n.claim_nbr and
m.paysum <> n.paysum;

It came back with the claim number causing the issue. I looked at all the entries in my_table for that claim and every field was identical except the ymdeff, ymdpaid and amtpay. There were 4 records in my_table however the amtpay in new_table was only a sum of 2 of the records... I our admin look over my shoulder to see what was wrong and they wanted me to recreate new_table. So I dropped new_table and ran the exact same SQL to recreate the table. The number of distinct claim numbers was still the same in both tables and the sum of new_table was off but not by the same amount. I ran my comparison to see which claim was off and now there were two claims where the totals didn't match and neither were the same as the claim that was wrong that first time. We dropped new_table and recreated it several times and every time we got different results... No one else knows the name of my_table so no one was messing with it at the same time plus the sum of amtpay in my_table always comes back the same.

Our admin said he thought he remembered there being something "funny" with the min function sometime so he had me remove those fields. Ran the query several times and the total came out correct each time. Well I still need the dates so I came up with another way (very convoluted) using subqueries and ranking. It seemed to work at first then it started losing random numbers of claims (fewer rows in new_table than distinct claims in my_table) or keeping all the claims but dropping detail lines like I had using the min functions.

The admin is stumped, I've Googled but I'm not even sure exactly how to search for this kind of problem... Any suggestions or help would be appreciated, TIA!!!!

Here is the backwards way around using min that drops whole claims sometimes but works fine other times:
CREATE table new_table
as
(select claim_nbr,other data...,amtpay,ymdeff,ymdpaid
from
(select claim_nbr,other data... ,sum(amtpay) amtpay
from my_table
group by claim_nbr,other data...
order by claim_nbr) a,
(select claim_nbr_b,ymdeff,ymdpaid from
(select distinct claim_nbr_b,ymdeff from
(select claim_nbr claim_nbr_b,ymdeff,rank() over (partition by claim_nbr order by ymdeff) rankb
from my_table)
where rankb=1) sub_a,
(select distinct claim_nbr_c,ymdpaid from
(select claim_nbr claim_nbr_c,ymdpaid,rank() over (partition by claim_nbr order by ymdpaid) rankc
from my_table)
where rankc=1) sub_b
where claim_nbr_b = claim_nbr_c
) b
where
claim_nbr=claim_nbr_b);
dvsoukup
Hi Jwolgast,

Saw this and wanted to point something out that I noticed....though I don't think this will lead to the ultimate answer of "does the min() function work properly?".


This might be your first problem here, when you create that new_table:

create table new_table as select claim_nbr,other data...,min(ymdeff),min(ymdpaid),sum(amtpay) from my_table group by claim_nbr,other data...



Have you double-checked that when you do a select claim, other data...,min(ymdeff),min(ymdpaid),sum(amtpay) from my_table that you do in fact get unique claim numbers back per row? First red flag that I would look into is all that 'other data'. If that other data is different, then it would casue additional rows of the same claim number to be returned, which would change what it truly finds for the 'minimum value', because you're GROUPING by that other data.

Of course, if you do a select distinct claim from each table, you're going to get the same results back no matter what, as in that case it doesn't care how many rows of each claim exist. I wouldn't recommend using that as a validation check in this case.

Try running this and see what you get:

select claim_nbr, count(*) from (
select claim_nbr,other data...,min(ymdeff),min(ymdpaid),sum(amtpay) from my_table group by claim_nbr,other data)
group by claim_nbr
having count(*) >1;

See if that returns anything. If so, that's probably the reason why your data isn't matching up right. Just shooting in the dark here though based off of what you wrote...hopefully it helps smile.gif
jwolgast
QUOTE (dvsoukup @ Jan 19 2012, 02:51 PM) *
Hi Jwolgast,

Saw this and wanted to point something out that I noticed....though I don't think this will lead to the ultimate answer of "does the min() function work properly?".


This might be your first problem here, when you create that new_table:

create table new_table as select claim_nbr,other data...,min(ymdeff),min(ymdpaid),sum(amtpay) from my_table group by claim_nbr,other data...



Have you double-checked that when you do a select claim, other data...,min(ymdeff),min(ymdpaid),sum(amtpay) from my_table that you do in fact get unique claim numbers back per row? First red flag that I would look into is all that 'other data'. If that other data is different, then it would casue additional rows of the same claim number to be returned, which would change what it truly finds for the 'minimum value', because you're GROUPING by that other data.

Of course, if you do a select distinct claim from each table, you're going to get the same results back no matter what, as in that case it doesn't care how many rows of each claim exist. I wouldn't recommend using that as a validation check in this case.

Try running this and see what you get:

select claim_nbr, count(*) from (
select claim_nbr,other data...,min(ymdeff),min(ymdpaid),sum(amtpay) from my_table group by claim_nbr,other data)
group by claim_nbr
having count(*) >1;

See if that returns anything. If so, that's probably the reason why your data isn't matching up right. Just shooting in the dark here though based off of what you wrote...hopefully it helps smile.gif



Thanks for the suggestion but I did verify that "other data..." really does match up exactly and I also mentioned that running select count (*) from new_table matches the select distinct from my_table sad.gif

Also the issue always seems to be too little info in the new_table, not too much. The sum is always less or with the second SQL the row count is sometimes less, I never end up with more in the new_table than I had in my_table...

I appreciate your time though, thanks for looking! smile.gif
jwolgast
I'll also add that while this SQL always gives a random number of rows,

create table new_table
as
(select claim_nbr,other data...,amtpay,ymdeff,ymdpaid
from
(select claim_nbr,other data... ,sum(amtpay) amtpay
from my_table
group by claim_nbr,other data...
order by claim_nbr) a,
(select claim_nbr_b,ymdeff,ymdpaid from
(select distinct claim_nbr_b,ymdeff from
(select claim_nbr claim_nbr_b,ymdeff,rank() over (partition by claim_nbr order by ymdeff) rankb
from my_table)
where rankb=1) sub_a,
(select distinct claim_nbr_c,ymdpaid from
(select claim_nbr claim_nbr_c,ymdpaid,rank() over (partition by claim_nbr order by ymdpaid) rankc
from my_table)
where rankc=1) sub_b
where claim_nbr_b = claim_nbr_c
) b
where
claim_nbr=claim_nbr_b);

This SQL always matches the number of distinct claims in my_table

select count (*) from
(select claim_nbr,other data...,amtpay,ymdeff,ymdpaid
from
(select claim_nbr,other data... ,sum(amtpay) amtpay
from my_table
group by claim_nbr,other data...
order by claim_nbr) a,
(select claim_nbr_b,ymdeff,ymdpaid from
(select distinct claim_nbr_b,ymdeff from
(select claim_nbr claim_nbr_b,ymdeff,rank() over (partition by claim_nbr order by ymdeff) rankb
from my_table)
where rankb=1) sub_a,
(select distinct claim_nbr_c,ymdpaid from
(select claim_nbr claim_nbr_c,ymdpaid,rank() over (partition by claim_nbr order by ymdpaid) rankc
from my_table)
where rankc=1) sub_b
where claim_nbr_b = claim_nbr_c
) b
where
claim_nbr=claim_nbr_b);

I even tried creating new_table2 just to make sure there wasn't some rogue caching or something going on by dropping and recreating with the same table name each time but I'm still getting random results...

dvsoukup
QUOTE (jwolgast @ Jan 19 2012, 02:36 PM) *
Thanks for the suggestion but I did verify that "other data..." really does match up exactly and I also mentioned that running select count (*) from new_table matches the select distinct from my_table sad.gif

Also the issue always seems to be too little info in the new_table, not too much. The sum is always less or with the second SQL the row count is sometimes less, I never end up with more in the new_table than I had in my_table...

I appreciate your time though, thanks for looking! smile.gif



Hmmmm. Tough one indeed. This might be a silly question but...

What is the datatype of those ymdeff and ymdpaid? If they just happened to be varchar2/char data types, but stores in a 'date looking format', then using the min function would surely not work right here. Bad design on the database part...but you just never know...

Though that still wouldn't explain why you constantly get different values each time you run your checks.
burleson
>> The number of distinct claim numbers was still the same in both tables and the sum of new_table was off but not by the same amount

Is this reproduceable?

Read consistency only applies from the time a specific job starts until EOJ.

In a high active table, the results will change between runs.

Anyway, check MOSC since you may have hit a bug.

http://support.oracle.com

I don't know the release you are on, but the r1 of 10g and 11g both have bugs and patches for incorrect results:

http://www.google.com/search?sourceid=ie7&...011&bih=521
.13.7.1.1l37l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=8b6c04f6f6b326e6&biw=1011&bih=521
jwolgast
QUOTE (dvsoukup @ Jan 19 2012, 08:08 PM) *
Hmmmm. Tough one indeed. This might be a silly question but...

What is the datatype of those ymdeff and ymdpaid? If they just happened to be varchar2/char data types, but stores in a 'date looking format', then using the min function would surely not work right here. Bad design on the database part...but you just never know...

Though that still wouldn't explain why you constantly get different values each time you run your checks.


Both ymd fields are not null number(8)... Our admin had me use toChar with the min to see if that made any difference and it did not.
jwolgast
QUOTE (burleson @ Jan 20 2012, 07:35 AM) *
>> The number of distinct claim numbers was still the same in both tables and the sum of new_table was off but not by the same amount

Is this reproduceable?

Read consistency only applies from the time a specific job starts until EOJ.

In a high active table, the results will change between runs.

Anyway, check MOSC since you may have hit a bug.

http://support.oracle.com

I don't know the release you are on, but the r1 of 10g and 11g both have bugs and patches for incorrect results:

http://www.google.com/search?sourceid=ie7&...011&bih=521
.13.7.1.1l37l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=8b6c04f6f6b326e6&biw=1011&bih=521


This is reproducable, every time I drop new_table and recreate it without making any changes to the SQL used to create it I get different results. I tried 6 times in a row with the admin sitting with me and we got different (wrong) results every single time.

It looks like we are on Oracle 10.2.0.1.0, I'm a programmer so I'm not exactly sure how to interpret the version number but I don't think we are on an r1... I'm also not sure how to search for this kind of problem on Oracle's site. I tried several searches there before I posted here but didn't find anything like what I am seeing. I saw people with issues with views and such but this is a static table I'm trying to pull from. Also this table is not very large (about 47,000 rows) and is not being touched by anyone but me. I created it and no one else knows the name (I've been using my_table and new_table for simplicity, I am working on this under an internal request number and the actual table names contain my request number which no one else is working on). Thanks for the suggestions though!
burleson
Hi Tom,

>> I'm a programmer so I'm not exactly sure how to interpret the version number but I don't think we are on an r1...

Does your shop have a DBA?

***********************************************
>> It looks like we are on Oracle 10.2.0.1.0,

Perfect!

Here is the bugs fixed in 10.2.0.4:

http://www.eygle.com/Notes/10204_buglist.htm

There is more than 100 bugs that say "wrong results":

Either you, your DBA, or somebody with your CSI number (customer support ID) needs to open a service request and search the bugs database.

It may not be a bug, but then again, it smells fishy . . . .
This is a "lo-fi" version of our main content. To view the full version with more information, formatting and images, please click here.
Invision Power Board © 2001-2014 Invision Power Services, Inc.