
Visualizing Sets in R with the UpSetR Package: A Worked Example

2022-06-24 14:06:15

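Introduction

This article introduces UpSetR, an R package built for visualizing sets. When a Venn diagram with many sets becomes hard to read, an UpSet plot is where this package shines.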

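I. R package and data

# Install and load the R package
# install.packages("UpSetR")
library(UpSetR)
# Load the dataset
data <- read.csv("upSet.csv", header = TRUE)
# The dataset is wide, so preview only the first six columns (first six rows)
head(data[, 1:6], 6)
# View(data)  # opens the data in a viewer window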
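upset() expects a data frame in which each set is a column of 0/1 membership flags and each row is one element; any other columns (such as ReleaseDate or AvgRating later in this article) are treated as attributes. The contents of upSet.csv are not shown here, so the following is only a minimal sketch on made-up toy data that illustrates the layout. The names `toy_sets`, `elements_all`, and `toy` are hypothetical.

```r
# Toy example (made-up data): three named sets of element IDs
toy_sets <- list(
  A = c("g1", "g2", "g3", "g4"),
  B = c("g2", "g3", "g5"),
  C = c("g3", "g4", "g5", "g6")
)

# All distinct elements across the sets
elements_all <- sort(unique(unlist(toy_sets)))

# One row per element, one 0/1 column per set
toy <- as.data.frame(
  lapply(toy_sets, function(s) as.integer(elements_all %in% s)),
  row.names = elements_all
)
print(toy)

# With UpSetR installed, fromList() builds the same kind of table
# from a named list, and the result can be passed straight to upset()
if (requireNamespace("UpSetR", quietly = TRUE)) {
  UpSetR::upset(UpSetR::fromList(toy_sets), order.by = "freq")
}
```

If your own data starts out as lists of identifiers rather than a 0/1 table, converting it this way first is usually the simplest route.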

II. The upset() function

The upset() function in the UpSetR package draws the set visualization.

1) Basic arguments

upset(data,
      sets = c("Action", "Adventure", "Comedy", "Drama", "Fantasy", "Children", "Crime"), # show only these sets
      mb.ratio = c(0.55, 0.45), # height ratio of the upper bar chart to the lower dot matrix
      order.by = "freq", # sort intersections by size (largest first by default)
      keep.order = TRUE, # keep the sets in the order given in `sets`
      number.angles = 30, # angle of the numbers above the bars
      point.size = 2, line.size = 1, # size of the matrix points and connecting lines
      mainbar.y.label = "Genre Intersections", sets.x.label = "Movies Per Genre", # axis labels
      text.scale = c(1.3, 1.3, 1, 1, 1.5, 1)) # six values, in order: c(intersection size title, intersection size tick labels, set size title, set size tick labels, set names, numbers above bars)
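Each bar in the upper chart counts the elements that belong to exactly the combination of sets marked by the dots beneath it, and to none of the other selected sets. These exclusive counts are what order.by = "freq" sorts on. The sketch below reproduces that tally in base R on made-up toy data; `toy`, `pattern`, and `sizes` are hypothetical names, and the numbers are not taken from the article's dataset.

```r
# Made-up membership table: one row per element, one 0/1 column per set
toy <- data.frame(
  Action = c(1, 1, 0, 0, 1, 0, 1),
  Comedy = c(0, 1, 1, 1, 0, 0, 1),
  Drama  = c(0, 0, 1, 0, 0, 1, 1)
)

# Label each row by the exact combination of sets it belongs to
pattern <- apply(toy, 1, function(r) {
  paste(names(toy)[r == 1], collapse = "&")
})

# Exclusive intersection sizes, largest first
sizes <- sort(table(pattern), decreasing = TRUE)
print(sizes)

# The set sizes in the left-hand bar chart are plain column sums
print(colSums(toy))
```

Note that the exclusive counts add up to the number of rows, while the column sums count an element once for every set it belongs to.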

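2) The queries argument

queries takes a list of queries. Each query is itself a list with the fields query, params, color, and active, plus an optional query.name:

query: the query function to apply; UpSetR ships built-in queries, and custom functions can also be used;

params: a list of arguments for the query; for intersects, the sets that define the intersection;

color: the color used for this query; if omitted, a color is taken from the package's default palette;

active: how the queried bar is marked. TRUE overlays the bar with the color; FALSE places a triangle at the top of the bar;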

upset(data, main.bar.color = "black",
      queries = list(
        list(query = intersects,     # UpSetR's built-in intersects query
             params = list("Drama"), # the intersection the query applies to
             color = "red",          # color; the default palette is used if omitted
             active = FALSE,         # TRUE: bar overlaid with color; FALSE: triangle at the top of the bar
             query.name = "Drama"),  # legend label for this query
        list(query = intersects, params = list("Action", "Drama"), active = TRUE, query.name = "Emotional action"),
        list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE)),
      query.legend = "top")
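Besides the built-in intersects query, a query can be a user-defined function that receives one row of the data followed by the values given in params, and returns TRUE for the rows to highlight. The following is a sketch modelled on the custom-query pattern in the UpSetR vignette. The function name `recent_and_rated`, the thresholds, and the toy row are hypothetical, and the as.numeric() calls are a defensive assumption in case the row arrives as character values.

```r
# Hypothetical custom query: highlight elements released in the given years
# whose average rating exceeds a threshold
recent_and_rated <- function(row, years, min_rating) {
  # as.numeric() is a defensive assumption: the row may be passed as character
  as.numeric(row["ReleaseDate"]) %in% years &
    as.numeric(row["AvgRating"]) > min_rating
}

# Quick check on a made-up row before handing the function to upset()
toy_row <- c(ReleaseDate = 1971, AvgRating = 3.2)
print(recent_and_rated(toy_row, c(1970, 1971, 1972), 2.5))

# Usage sketch (assumes `data` has ReleaseDate and AvgRating columns):
# upset(data,
#       queries = list(list(query = recent_and_rated,
#                           params = list(c(1970, 1971, 1972), 2.5),
#                           color = "blue", active = TRUE)))
```

Testing the function on a single named vector first makes it easier to catch a misspelled column name before it is applied inside the plot.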

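3) The attribute.plots argument

Adds attribute plots to the figure. The built-in plot functions include histogram and scatter_plot.

3.1 Adding a histogram and a scatter plot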

upset(data, main.bar.color = "black",
      queries = list(
        list(query = intersects, params = list("Drama"), color = "red",
             active = FALSE, query.name = "Drama"),
        list(query = intersects, params = list("Action", "Drama"), active = TRUE, query.name = "Emotional action"),
        list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE)),
      attribute.plots = list(
        gridrows = 45, # rows added to the plotting grid to make room for the attribute plots
        plots = list(
          list(plot = scatter_plot, # scatter plot
               x = "ReleaseDate", y = "AvgRating", # variables on the x and y axes
               queries = TRUE), # TRUE: show the colors defined in queries above
          list(plot = histogram, x = "ReleaseDate", queries = FALSE)), # histogram without query colors
        ncols = 2), # arrange the attribute plots in two columns
      query.legend = "top") # place the query legend at the top

3.2 Adding box plots

At most two box plots can be added per call.

upset(data, boxplot.summary = c("AvgRating", "ReleaseDate"))
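Each box in a boxplot.summary panel summarizes the distribution of one attribute within one intersection. As a rough illustration of that grouping, the base R sketch below computes a per-intersection median and draws an analogous box plot on made-up toy data. The names `toy`, `set_cols`, `pattern`, and `med` are hypothetical, and this is not how UpSetR draws its panels internally.

```r
# Made-up data: set membership plus one numeric attribute
toy <- data.frame(
  Action    = c(1, 1, 0, 0, 1, 0),
  Drama     = c(0, 1, 1, 1, 0, 1),
  AvgRating = c(3.1, 3.8, 4.2, 3.9, 2.7, 4.5)
)
set_cols <- c("Action", "Drama")

# Exclusive intersection each row belongs to
pattern <- apply(toy[set_cols], 1, function(r) {
  paste(set_cols[r == 1], collapse = "&")
})

# Median of the attribute within each intersection
med <- tapply(toy$AvgRating, pattern, median)
print(med)

# Base-graphics analogue of a single box plot panel
boxplot(AvgRating ~ pattern, data = cbind(toy, pattern = pattern),
        xlab = "Intersection", ylab = "AvgRating")
```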

3.3 Adding a density plot

The built-in attribute plots do not include a density plot, so a custom plot function is needed.

# Custom density plot (needs ggplot2, plus plyr for round_any())
library(ggplot2)
library(plyr)
another.plot <- function(data, x, y) {
    # Round the year in column y up to the next decade and keep 1970 onward
    data$decades <- round_any(as.integer(unlist(data[y])), 10, ceiling)
    data <- data[which(data$decades >= 1970), ]
    # One density curve of x per decade
    myplot <- (ggplot(data, aes_string(x = x)) + geom_density(aes(fill = factor(decades)), 
        alpha = 0.4) + theme(plot.margin = unit(c(0, 0, 0, 0), "cm"), legend.key.size = unit(0.4, "cm")))
}
upset(data, main.bar.color = "black", mb.ratio = c(0.5, 0.5), queries = list(list(query = intersects, 
    params = list("Drama"), color = "red", active = FALSE), list(query = intersects, 
    params = list("Action", "Drama"), active = TRUE), list(query = intersects, 
    params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE)), 
    attribute.plots = list(gridrows = 50, plots = list(list(plot = histogram, 
        x = "ReleaseDate", queries = FALSE), list(plot = scatter_plot, x = "ReleaseDate", 
        y = "AvgRating", queries = TRUE), list(plot = another.plot, x = "AvgRating", 
        y = "ReleaseDate", queries = FALSE)), ncols = 3))

