if you have month and year as separate columns…

An easy confusion of logic was succinctly cleared up by Toon Koppelaars on oracle-l today. It *may* be helpful to read that thread before you read this post, but I hope it stands alone.

The original poster wanted to exclude only month number 5 from year 2012, but was perplexed that
"where (month != 5 and year != 2012)"
also excluded the 5th month of 2011 and all of the data from 2012.

Toon referenced and explained from De Morgan’s Laws that “not X and not Y” is equivalent to “not (X or Y)”, which would be “not (month=5 or year=2012) in your first query.

For those who are truth table challenged, it may be helpful to consider this (regardless of the actual plan Oracle might choose) as a filter operation. So you get a row, and if it is either that nasty 5th month you apparently did not want or that nasty year 2012, throw it away.

What the poster wanted was only to omit the 5th month of 2012. That is “not (month=5 and year=2012)” as Toon so succinctly wrote.

Now I found it interesting that the 5th month of 2012 was the last month for which there were rows in the table at all. As an information processing issue, this looks to me as if you want all the data except the last month (which is perhaps still in process or as yet not audited). If that is indeed accurate, then

where year < 2012 or month < 5

would fulfill the logical requirement and this would helpfully lead to re-usable code

where year < &incomplete_year or month < &incomplete_month

which is a little more clear (if in fact my surmise of the purpose is correct which we’ll consider true for the rest of this posting.) But this still requires logic on two columns when in fact what is probably wanted is simply the data before 2012-05. (yyyy-mm format).

Now this can be rendered as

where to_date(to_char(year,'FM0000')||to_char(month,'FM00'),'yyyymm') <
to_date('&incomplete_month','yyyymm')

which makes it clear that this is really a range query, if you can discern that fact through all the formatting used to shove the data cleanly through the to_char and to_date functions.

Now this points out one of the benefits of using time value columns in the first place, so if that is an operationally practical solution you can fix the data model. If it is not operationally practical to retool the data model, Oracle has provided us with a virtual column capability. You won’t have to see the sausage getting made in your routine reading of queries, and you can put a helpful comment on the table in case someone might want to read in text what your intention was for the meaning of the virtual column.

In this case,
alter table sometable add year_month as (to_date(to_char(year,'FM0000')||to_char(month,'FM00'),'yyyymm'));
comment on column sometable.year_month is 'year and month columns combined to produce a valid date';
does the trick the where clause becomes really simple:
where year_month < to_date('&incomplete_month','yyyymm')

This makes the meaning of the query trivial to understand, and as the volume of data grows and perhaps older data is no longer of interest predicates of the form

where year_month > to_date('&older_than_relevant','yyyymm')
and year_month < to_date('&incomplete_month','yyyymm')

or
where year_month between to_date('&older_than_relevant','yyyymm') and to_date('&incomplete_month','yyyymm')
if you prefer between syntax neatly does the trick.
Since you can index a virtual column and partition on it as well, this then becomes a real opportunity for reducing the amount of work the computer will need to do to answer your query. With an index and no partitioning, there is a good chance it will be cheaper to access the data you need via the index. That will depend on the details. With partitioning you may get partition pruning and the nature of the date partitioning likely means you can set it up to scan whole partitions which should dovetail nicely with parallel query and exadata optimizations. (That bit about exadata is speculation. Oracle still hasn’t sent me one to play with.)

So I’ve gone well beyond the subject of the oracle-l question which was already answered. That’s what a few friends keep telling me I should use a blog for and it is slowly sinking in.

Advertisements

About rsiz

Father and Husband, Oracle Technology Scientist and Consultant, planning to end poverty for citizens and legal US residents Lebanon, NH · http://www.rsiz.com See my wife's puzzles at thingamajigsaw.com
This entry was posted in Oracle. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s