<em>Mac</em>Book项目 2009年学校开始实施<em>Mac</em>Book项目,所有师生配备一本<em>Mac</em>Book,并同步更新了校园无线网络。学校每周进行电脑技术更新,每月发送技术支持资料,极大改变了教学及学习方式。因此2011
2021-06-01 09:32:01
零、需求介紹
現有一張表資料如下:
此表是一張映象表,policyno列代表一個保單號,state列代表這個保單號在snapdate當天的最後一次狀態(state每天可能會變很多次,映象表只保留snapdate時間點凌晨的最後一次狀態),snapdate代表當天做映象的時間,現在有個需求,我們想取出來這個保單號連續保持某個狀態的起止時間,例如:
保單號sm1保持狀態1的起止時間為2021020120210202,然後在20210203時候變成了狀態2,又在20210204時候變成了狀態3,最終又在2021020520210209時間段保持在狀態1,然後映象表的程式可能期間出現過問題,在20210210開始到20210215日沒有映象成功,直到20210216日才恢復,20210216~20210219日保單號sm1的狀態一直保持為1,後續還有可能繼續變,那麼,上面說的保單sm1的幾個狀態的連續時間,我們想要的結果為:
POLICYNO STATE START_DATE END_DATE sm1 1 20210201 20210202 sm1 2 20210203 20210203 sm1 3 20210204 20210204 sm1 1 20210205 20210209 sm1 1 20210216 20210219 .........................
我這裡提供5種寫法,可以歸結為兩大類:
一類:通過使用分析函數或自關聯獲取資料連續性,構造一個分組欄位進行分組求最大最小值。
二類:通過樹形層次查詢獲取連續性,獲取起止時間。
一、通過使用lag分析函數獲取前後時間,根據當前時間與前後時間的差值進行判斷獲取時間連續性標誌,然後使用sum()over()對連續性標誌進行累加,從而生成一個新的臨時分組欄位,最終根據policyno,state,臨時分組欄位進行分組取最大最小值
這裡為了好理解,每一個處理步驟都單獨寫出來了,實際使用中可以簡寫一下:
with t as--求出來每條資料當天的前一天映象時間 (select a.policyno, a.state, a.snapdate, lag(a.snapdate) over(partition by a.policyno, a.state order by a.snapdate) as lag_tim from zyd.temp_0430 a order by a.policyno, a.snapdate), t1 as--判斷當天映象時間和前一天的映象時間+1是否相等,如果相等就置為0否則置為1,新增臨時欄位lxzt意為:連續狀態標誌 (select t.*, case when t.snapdate = t.lag_tim + 1 then 0 else 1 end as lxzt from t order by policyno, snapdate), t2 as--根據lxzt欄位進行sum()over()求和,求出來一個新的用來做分組依據的欄位,簡稱fzyj (select t1.*, sum(lxzt) over(order by policyno, snapdate) as fzyj from t1) select policyno,--最後根據policyno,state,fzyj進行分組求最大最小值即為狀態連續的開始結束時間 state, -- fzyj, min(snapdate) as start_snap, max(snapdate) as end_snap from t2 group by policyno, state, fzyj order by fzyj;
二、不使用lag分析函數,通過自關聯也能判斷出來哪些天連續,然後後面操作步驟同上,這個寫法算是對lag()over()函數的一個回寫,擺脫對分析函數的依賴
下面這種寫法,需要讀兩次表,上面lag的方式是對這個寫法的一種優化:
with t as (select a.policyno, a.state, a.snapdate, b.snapdate as snap2 from zyd.temp_0430 a, zyd.temp_0430 b where a.policyno = b.policyno(+) and a.state = b.state(+) and a.snapdate - 1 = b.snapdate(+) order by policyno, snapdate), t1 as (select t.*, case when snap2 is null then 1 else 0 end as lxzt from t order by policyno, snapdate), t2 as (select t1.*, sum(lxzt) over(order by policyno, snapdate) as fzyj from t1 order by policyno, snapdate) select policyno, state, fzyj, min(snapdate) as start_snap, max(snapdate) as end_snap from t2 group by policyno, state, fzyj order by fzyj;
三、通過構造樹形結構,確定根節點和葉子節點來獲取狀態連續的開始和結束時間
先按照資料的連續性構造顯示每層關係的樹狀結構:
with t as (select a.policyno, a.state, a.snapdate, lag(a.snapdate) over(partition by a.policyno, a.state order by a.snapdate) as lag_tim from zyd.temp_0430 a --where policyno='sm1' order by a.policyno, a.snapdate), t1 as (select t.*, case when t.snapdate = t.lag_tim + 1 then 0 else 1 end as lxzt from t order by policyno, snapdate), t2 as (select t1.*, lpad('->', (level - 1) * 2, '->') || snapdate as 樹狀結構, level as 樹中層次, decode(level, 1, 1) 是否根節點, decode(connect_by_isleaf, 1, 1) 是否葉子節點, case when (connect_by_isleaf = 0 and level > 1) then 1 end 是否樹杈, (prior snapdate) as 根值, connect_by_root snapdate 主根值 from t1 start with (lxzt = 1) connect by (prior snapdate = snapdate - 1 and prior state = state and prior policyno = policyno) order by policyno, snapdate) select * from t2;
從上面能清晰的看出來,每一次連續狀態的開始日期作為每個樹的根,分支節點即樹杈和葉子節點的關係一步步拓展開來,分析上面資料我們能夠知道,如果我們想要獲取每個保單狀態連續時間範圍,以上面的資料現有分佈方式,現在就可以:通過policyno,state,主根值進行group by 取snapdate的最大最小值,類似前面兩個寫法的最終步驟;
接下來,我們這個第三種寫法就是按照這個方式寫:
with t as (select a.policyno, a.state, a.snapdate, lag(a.snapdate) over(partition by a.policyno, a.state order by a.snapdate) as lag_tim from zyd.temp_0430 a --where policyno='sm1' order by a.policyno, a.snapdate), t1 as (select t.*, case when t.snapdate = t.lag_tim + 1 then 0 else 1 end as lxzt from t order by policyno, snapdate), t2 as (select t1.*, lpad('->', (level - 1) * 2, '->') || snapdate as 樹狀結構, level as 樹中層次, decode(level, 1, 1) 是否根節點, decode(connect_by_isleaf, 1, 1) 是否葉子節點, case when (connect_by_isleaf = 0 and level > 1) then 1 end 是否樹杈, (prior snapdate) as 根值, connect_by_root snapdate 主根值 from t1 start with (lxzt = 1) connect by (prior snapdate = snapdate - 1 and prior state = state and prior policyno = policyno) order by policyno, snapdate) select policyno, state, min(snapdate) as start_date, max(snapdate) as end_date from t2 group by policyno, state, 主根值 order by policyno, state;
四、參照過程三,既然已經獲取了每條資料的主根值和葉子節點的值,這就代表了我們知道了每個保單狀態的連續開始和結束時間,那直接取出來葉子節點資料,葉子節點主根值就是開始日期,葉子節點的值就是結束日期,這樣我們就不需再group by了
with t as (select a.policyno, a.state, a.snapdate, lag(a.snapdate) over(partition by a.policyno, a.state order by a.snapdate) as lag_tim from zyd.temp_0430 a --where policyno='sm1' order by a.policyno, a.snapdate), t1 as (select t.*, case when t.snapdate = t.lag_tim + 1 then 0 else 1 end as lxzt from t order by policyno, snapdate), t2 as (select t1.*, lpad('->', (level - 1) * 2, '->') || snapdate as 樹狀結構, level as 樹中層次, decode(level, 1, 1) 是否根節點, decode(connect_by_isleaf, 1, 1) 是否葉子節點, case when (connect_by_isleaf = 0 and level > 1) then 1 end 是否樹杈, (prior snapdate) as 根值, connect_by_root snapdate 主根值 from t1 start with (lxzt = 1) connect by (prior snapdate = snapdate - 1 and prior state = state and prior policyno = policyno) order by policyno, snapdate) select policyno, state, 主根值 as start_date, snapdate as end_date from t2 where 是否葉子節點 = 1 order by policyno, snapdate
五、在Oracle10g之前,上面樹狀查詢的關鍵函數 connect_by_root還不支援,如果使用樹形結構,可以通過sys_connect_by_path來實現
with t as (select a.policyno, a.state, a.snapdate, lag(a.snapdate) over(partition by a.policyno, a.state order by a.snapdate) as lag_tim --case when lag(a.snapdate) over(partition by a.policyno, a.state order by a.snapdate) is null then snapdate else lag(a.snapdate) over(partition by a.policyno, a.state order by a.snapdate) end as lag_tim from zyd.temp_0430 a order by a.policyno, a.snapdate), t1 as (select t.*, case when t.snapdate = t.lag_tim + 1 then 0 else 1 end as lxzt from t order by policyno, snapdate), t2 as (select t1.*, sys_connect_by_path(snapdate, ',') as pt, level, connect_by_isleaf as cb from t1 start with (lxzt = 1) connect by (prior snapdate = snapdate - 1 and prior state = state and prior policyno = policyno)) select t2.*, regexp_substr(pt, '[^,]+', 1, 1) as start_date, regexp_substr(pt, '[^,]+', 1, regexp_count(pt, ',')) as end_date from t2 where cb = 1 order by policyno, state;
還有好多其他寫法,這裡不再一一列舉!
總結
到此這篇關於Oracle數倉中判斷時間連續性的幾種SQL寫法的文章就介紹到這了,更多相關Oracle數倉判斷時間連續性內容請搜尋it145.com以前的文章或繼續瀏覽下面的相關文章希望大家以後多多支援it145.com!
相關文章
<em>Mac</em>Book项目 2009年学校开始实施<em>Mac</em>Book项目,所有师生配备一本<em>Mac</em>Book,并同步更新了校园无线网络。学校每周进行电脑技术更新,每月发送技术支持资料,极大改变了教学及学习方式。因此2011
2021-06-01 09:32:01
综合看Anker超能充系列的性价比很高,并且与不仅和iPhone12/苹果<em>Mac</em>Book很配,而且适合多设备充电需求的日常使用或差旅场景,不管是安卓还是Switch同样也能用得上它,希望这次分享能给准备购入充电器的小伙伴们有所
2021-06-01 09:31:42
除了L4WUDU与吴亦凡已经多次共事,成为了明面上的厂牌成员,吴亦凡还曾带领20XXCLUB全队参加2020年的一场音乐节,这也是20XXCLUB首次全员合照,王嗣尧Turbo、陈彦希Regi、<em>Mac</em> Ova Seas、林渝植等人全部出场。然而让
2021-06-01 09:31:34
目前应用IPFS的机构:1 谷歌<em>浏览器</em>支持IPFS分布式协议 2 万维网 (历史档案博物馆)数据库 3 火狐<em>浏览器</em>支持 IPFS分布式协议 4 EOS 等数字货币数据存储 5 美国国会图书馆,历史资料永久保存在 IPFS 6 加
2021-06-01 09:31:24
开拓者的车机是兼容苹果和<em>安卓</em>,虽然我不怎么用,但确实兼顾了我家人的很多需求:副驾的门板还配有解锁开关,有的时候老婆开车,下车的时候偶尔会忘记解锁,我在副驾驶可以自己开门:第二排设计很好,不仅配置了一个很大的
2021-06-01 09:30:48
不仅是<em>安卓</em>手机,苹果手机的降价力度也是前所未有了,iPhone12也“跳水价”了,发布价是6799元,如今已经跌至5308元,降价幅度超过1400元,最新定价确认了。iPhone12是苹果首款5G手机,同时也是全球首款5nm芯片的智能机,它
2021-06-01 09:30:45