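18.1 Modifying scales
18.1.1 Customizing axes
In ggplot2, the scale_x_* and scale_y_* functions control the x and y axes, where * names the scale type. Common types include continuous, discrete, binned, log10, date, and datetime (for example, scale_x_continuous() or scale_y_discrete()).
18.1.1.1 Customizing axes for continuous variables
Commonly used options of the scale_*_continuous functions include name, breaks, minor_breaks, n.breaks, labels, limits, and position.
18.1.1.1.1 Example
Using the mtcars dataset, draw a scatter plot of car weight (wt) against fuel efficiency (mpg):
1. Basic version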
library(ggplot2)
ggplot(data=mtcars,aes(x=wt,y=mpg))+
geom_point()+
labs(title = "Fuel efficiency by car weight")
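2. Modified version
Make the following changes:
1. For weight (wt): set the axis label to "Weight(1000lbs)", set the scale limits to 1.5 to 5.5, request 10 major breaks (n.breaks is a target that the break algorithm may adjust slightly), and hide the minor breaks;
2. For miles per gallon (mpg): set the axis label to "Miles per gallon", set the scale limits to 10 to 35, place major breaks at 10, 15, 20, 25, 30, and 35, and draw minor breaks at intervals of 1 mpg.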
library(ggplot2)
ggplot(data=mtcars,aes(x=wt,y=mpg))+
geom_point()+
labs(title = "Fuel efficiency by car weight")+
scale_x_continuous(name = "Weight(1000lbs)",
n.breaks=10,
limits = c(1.5,5.5),
minor_breaks = NULL)+
scale_y_continuous(name = "Miles per gallon",
limits= c(10,35),
breaks = seq(10,35,5),
minor_breaks=seq(10,35,1))
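One point worth checking when setting limits on a scale: observations outside the limits are dropped before drawing. The sketch below is an illustration rather than part of the original example. It confirms that the limits used above cover every row of mtcars, then shows coord_cartesian() as an alternative that zooms without dropping data. The variable names and the zoom ranges are assumptions chosen for the demonstration.

```r
library(ggplot2)

# Limits taken from the example above.
x_limits <- c(1.5, 5.5)
y_limits <- c(10, 35)

# TRUE for each row that lies inside both limits.
inside <- mtcars$wt >= x_limits[1] & mtcars$wt <= x_limits[2] &
  mtcars$mpg >= y_limits[1] & mtcars$mpg <= y_limits[2]
sum(!inside)  # number of rows the scale limits would drop

# Narrower scale limits would drop rows (with a warning at draw time).
# coord_cartesian() zooms the view instead and keeps every row.
zoomed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  coord_cartesian(xlim = c(2, 4), ylim = c(15, 30))
zoomed
```

The distinction matters mainly when a layer computes statistics (for example a smoother), because scale limits change the data the statistic sees while coord_cartesian() does not.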
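18.1.1.2 Customizing axes for categorical variables
Commonly used scale_*_discrete options in ggplot2 are:
Argument | Description |
---|---|
name | Scale name; equivalent to labs(x=, y=) |
breaks | Character vector of the levels to display as breaks |
limits | Character vector defining the scale's values and their order |
labels | Character vector of labels (must be the same length as breaks); labels=abbreviate shortens long labels |
position | Axis position: left or right for y axes, top or bottom for x axes |
18.1.1.2.1 Example
This example uses the Wage data frame from the ISLR package, which contains wage and demographic information, collected in 2011, on 3000 male workers in one region of the United States. The plot shows the relationship between marital status and education level in this sample.
1. Basic plot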
library(ISLR)
## Warning: package 'ISLR' was built under R version 4.3.2
##
## Attaching package: 'ISLR'
## The following object is masked from 'package:vcd':
##
## Hitters
library(ggplot2)
data(Wage,package = "ISLR")
ggplot(data=Wage,aes(maritl,fill=education))+
geom_bar(position = "fill")+
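labs(title = "Participant Education by Marital Status")
2. Modified plot
For the x axis: change the axis name to "Maritl" and remove the numeric prefixes from the labels;
For the y axis: change the axis name to "Percent" and use percent formatting.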
library(ISLR)
library(ggplot2)
library(scales)
head(Wage)
## year age maritl race education region
## 231655 2006 18 1. Never Married 1. White 1. < HS Grad 2. Middle Atlantic
## 86582 2004 24 1. Never Married 1. White 4. College Grad 2. Middle Atlantic
## 161300 2003 45 2. Married 1. White 3. Some College 2. Middle Atlantic
## 155159 2003 43 2. Married 3. Asian 4. College Grad 2. Middle Atlantic
## 11443 2005 50 4. Divorced 1. White 2. HS Grad 2. Middle Atlantic
## 376662 2008 54 2. Married 1. White 4. College Grad 2. Middle Atlantic
## jobclass health health_ins logwage wage
## 231655 1. Industrial 1. <=Good 2. No 4.318063 75.04315
## 86582 2. Information 2. >=Very Good 2. No 4.255273 70.47602
## 161300 1. Industrial 1. <=Good 1. Yes 4.875061 130.98218
## 155159 2. Information 2. >=Very Good 1. Yes 5.041393 154.68529
## 11443 2. Information 1. <=Good 1. Yes 4.318063 75.04315
## 376662 2. Information 2. >=Very Good 1. Yes 4.845098 127.11574
data(Wage,package = "ISLR")
ggplot(data=Wage,aes(maritl,fill=education))+
geom_bar(position = "fill")+
labs(title = "Participant Education by Marital Status")+
scale_x_discrete(name="Maritl",
labels=c("Never Married","Married","Widowed","Divorced","Separated"))+
scale_y_continuous(name="Percent",
labels= percent_format(accuracy=2),
n.breaks=10)
Note: the vertical (y) axis represents a numeric variable, so it needs scale_y_continuous rather than scale_y_discrete.
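Typing the five labels by hand works, but the labels argument also accepts a function. As a minimal sketch, the code below strips the numeric prefix with a regular expression. The helper strip_prefix() and the toy data frame are hypothetical; they only mimic the label pattern of Wage$maritl so the example runs without the ISLR package.

```r
library(ggplot2)

# Hypothetical helper: remove a leading "<number>. " prefix from each label.
strip_prefix <- function(x) sub("^[0-9]+\\. *", "", x)

# Toy data standing in for Wage (assumption: same label pattern).
toy <- data.frame(
  maritl = factor(c("1. Never Married", "2. Married", "2. Married", "4. Divorced"))
)

strip_prefix(levels(toy$maritl))

# Passing the function to labels applies it to every break label.
p_toy <- ggplot(toy, aes(maritl)) +
  geom_bar() +
  scale_x_discrete(name = "Maritl", labels = strip_prefix)
p_toy
```

A labelling function keeps working if levels are added or reordered, whereas a hand-typed vector has to be kept in step with the factor levels.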
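18.1.2 Customizing colors
In ggplot2, the scale_color_*() functions set the colors of points, lines, borders, and text, while the scale_fill_*() functions set the fill colors of objects that have area.
Commonly used color-scale functions include scale_color_gradient(), scale_color_steps(), and scale_color_viridis_c() for continuous variables, and scale_fill_brewer(), scale_fill_viridis_d(), and scale_fill_manual() for categorical variables.
18.1.2.1 Continuous palettes
Again using the mtcars dataset, plot the relationship between car weight (wt) and fuel efficiency (mpg), and add a third variable by mapping engine displacement (disp) to point color.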
library(ggplot2)
p <- ggplot(mtcars,aes(x=wt,y=mpg,color=disp))+
geom_point(shape=19,size=3)+
scale_x_continuous(name = "Weight(1000 lbs.)",
n.breaks = 10,
minor_breaks = NULL,
limits = c(1.5,5.5))+
scale_y_continuous(name = "Miles per gallon",
breaks = seq(10,35,5),
minor_breaks = seq(10,35,1),
limits = c(10,35))
# Set the colors
p+ggtitle("A.Default color gradient")
p+scale_color_gradient(low = "grey",high = "black")+
ggtitle("B.Greyscale gradient")
p+scale_color_gradient(low = "red",high = "blue")+
ggtitle("C.Red-blue color Gradient")
p+scale_color_steps(low = "red",high = "blue")+
ggtitle("D.Red-blue binned color Gradient")
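# midpoint defaults to 0, so center the diverging palette on the median displacement
p+scale_color_steps2(low = "red",mid="white",high = "blue",midpoint = median(mtcars$disp))+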
ggtitle("E.Red-white-blue binned color Gradient")
p+scale_color_viridis_c(direction = -1)+
ggtitle("F.Viridis color gradient")
As these plots show, an unbinned scale varies color continuously, whereas a binned scale changes color in discrete steps.
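To make the idea of binning concrete, the sketch below groups disp into intervals with cut() and then passes the same interior edges to scale_color_steps() through its breaks argument. The bin edges are an assumption chosen only for illustration; they are not taken from the original plots.

```r
library(ggplot2)

# Assumed bin edges, chosen only for illustration.
edges <- c(0, 100, 200, 300, 400, 500)

# cut() assigns each displacement to one interval; a binned color scale
# gives every value in the same interval the same color.
bins <- cut(mtcars$disp, breaks = edges)
table(bins)

# The interior edges passed to a binned color scale as its breaks.
p_binned <- ggplot(mtcars, aes(x = wt, y = mpg, color = disp)) +
  geom_point(size = 3) +
  scale_color_steps(low = "red", high = "blue", breaks = edges[2:5])
p_binned
```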
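18.1.2.2 Categorical palettes
This example again uses the Wage data frame from the ISLR package, which contains wage and demographic information, collected in 2011, on 3000 male workers in one region of the United States. education is a categorical variable, so it is mapped to discrete colors.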
library(ISLR)
library(ggplot2)
library(scales)
b <- ggplot(data=Wage,aes(maritl,fill=education))+
geom_bar(position = "fill")+
labs(title = "Participant Education by Marital Status")+
scale_x_discrete(name="Maritl",
labels=c("Never Married","Married","Widowed","Divorced","Separated"))+
scale_y_continuous(name="Percent",
labels= percent_format(accuracy=2),
n.breaks=10)
# Set the colors
b+ggtitle("A.Default colors")
b+scale_fill_brewer(palette = "Set2")+
ggtitle("B.ColorBrewer Set2 palette")
b+scale_fill_viridis_d()+
ggtitle("C.Viridis color scheme")
b+scale_fill_manual(values = c("gold4","orange2","deepskyblue3","brown2","yellowgreen"))+
ggtitle("D.Manual color scheme")
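Note: education is a categorical variable (a factor whose level labels begin with numbers), not a numeric one. Because it is mapped to fill, use the scale_fill_* functions rather than scale_color_*.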
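With scale_fill_manual(), an unnamed values vector is matched to the factor levels by position. A named vector ties each color to a specific level instead. The sketch below illustrates this on a small made-up data frame; the data, the level names, and the color choices are all assumptions for the demonstration.

```r
library(ggplot2)

# Toy data (hypothetical), standing in for a factor such as education.
toy <- data.frame(
  group = c("a", "a", "b", "b"),
  level = factor(c("low", "high", "low", "high"), levels = c("low", "high"))
)

# Naming the values ties each color to a level regardless of level order.
level_colors <- c(high = "deepskyblue3", low = "gold4")

p_manual <- ggplot(toy, aes(group, fill = level)) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = level_colors)
p_manual
```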
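18.2 Modifying themes
In ggplot2, theme() customizes the non-data parts of a plot. See ?theme for the full list of arguments.
Each theme() argument targets one plot element.
For example, to set the x- and y-axis titles to 14-point blue text: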
library(ggplot2)
ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point()+
theme(axis.title = element_text(size = 14,color = "blue"))
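18.2.1 Built-in themes
ggplot2 ships with eight commonly used preset themes:
theme_gray(): the default theme, with a grey background and white grid lines.
theme_bw(): a black-and-white theme, with a white background, light grey grid lines, and a dark panel border.
theme_minimal(): a minimal theme with no background fill or panel border; grid lines are kept.
theme_classic(): a classic theme, with a white background, x and y axis lines, and no grid lines.
theme_void(): an empty theme with no background, grid lines, or border; only the data are drawn.
theme_light(): a light theme, with a white background and light grey grid lines and border.
theme_dark(): a dark theme, with a dark panel background.
theme_linedraw(): a line-drawing theme, with black lines on a white background.
The following example shows each one: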
library(ggplot2)
p <- ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point()
p+theme_grey()+ggtitle("theme_grey")
p+theme_bw()+ggtitle("theme_bw")
p+theme_minimal()+ggtitle("theme_minimal")
p+theme_classic()+ggtitle("theme_classic")
p+theme_void()+ggtitle("theme_void")
p+theme_light()+ggtitle("theme_light")
p+theme_dark()+ggtitle("theme_dark")
p+theme_linedraw()+ggtitle("theme_linedraw")
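A preset theme can be combined with individual theme() overrides, and it can also be made the session default. The following is a small sketch of both, not part of the original walkthrough; the object names and the base_size value are illustrative.

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# base_size scales the text of the preset theme; a later theme() call
# overrides individual elements of that preset.
styled <- base_plot +
  theme_bw(base_size = 14) +
  theme(axis.title = element_text(color = "blue"))
styled

# theme_set() makes a preset the default for plots drawn afterwards
# and returns the previous default so it can be restored.
old_theme <- theme_set(theme_minimal())
theme_set(old_theme)
```

Order matters here: a theme() call placed before the preset would be overwritten by it.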
18.2.2 Customizing fonts
18.2.2.1 Procedure
a. Obtain a local or Google font
A. Find local font files:
findfont <- function(x){
suppressMessages(require(showtext))
suppressMessages(require(dplyr))
# Call dplyr::filter() and dplyr::select() explicitly: if MASS is attached, its select() masks dplyr's and fails with "unused arguments"
dplyr::filter(font_files(),grepl(x,family,ignore.case=TRUE))%>%
dplyr::select(path,file,family,face)
}
findfont("comic")
Load the font files found above into R under the custom name "comic":
font_add("comic",regular = "comic.ttf",
bold = "comicbd.ttf")
B. Download a Google font:
font_add_google("name","family")
name: the name of the font on Google Fonts; family: a custom name used to refer to the font in later code.
For example:
font_add_google("Schoolbell","bell")
b. Turn on showtext rendering for graphics output
showtext_auto()
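c. Specify the font in ggplot2's theme() function
Use element_text() to specify the font family, face, size, color, and orientation of text: theme(* = element_text()), where * is a text-related theme() argument, including:
Argument | Description |
---|---|
axis.title, axis.title.x, axis.title.y | Axis titles |
axis.text, axis.text.x, axis.text.y | Axis tick labels |
legend.text, legend.title | Legend item labels and legend title |
plot.title, plot.subtitle, plot.caption | Plot title, subtitle, and caption |
strip.text, strip.text.x, strip.text.y | Facet labels |
18.2.2.2 Example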
library(ggplot2)
library(showtext) # provides showtext_auto(); also loads sysfonts for font_add() and font_add_google()
font_add("comic",regular = "comic.ttf",
bold = "comicbd.ttf")
font_add_google("Open Sans","Sans")
showtext_auto()
p <- ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point()+
labs(title = "Fuel Efficiency by Car Weight")
p+theme(plot.title = element_text(family = "Sans",size=14),
axis.title = element_text(family = "comic"))
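The example above depends on font files (comic.ttf) and a network download that may not be available on every machine. As a fallback sketch, the generic families "serif", "sans", and "mono" are mapped by R graphics devices to installed fonts, so the same element_text() arguments can be tried without showtext. The specific family and face choices below are illustrative only.

```r
library(ggplot2)

# Generic families need no font registration or download.
font_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Fuel Efficiency by Car Weight") +
  theme(plot.title = element_text(family = "serif", face = "bold", size = 14),
        axis.title = element_text(family = "mono"))
font_plot
```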
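18.2.3 Customizing legends
theme() arguments related to legends include legend.position, legend.justification, legend.direction, legend.title, legend.text, legend.background, and legend.key.
18.2.3.1 Example
Again using the mtcars dataset, with wt on the x axis and mpg on the y axis, color the points by number of engine cylinders, then modify the legend as follows:
Place the legend in the upper-right corner of the plot; give the legend the title "Cylinders"; list the legend categories horizontally; set the legend background to light grey and remove the background around the keys (the colored symbols); add a white border around the legend.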
library(ggplot2)
p <- ggplot(mtcars,aes(x=wt,y=mpg,color=factor(cyl)))+
geom_point(size=3)+
scale_color_discrete(name="Cylinders")+
labs(title = "Fuel Efficiency for 32 Automobiles",
x="Weight (1000 lbs)",
y="miles per gallon")
p
# The legend title "Cylinders" is already set by scale_color_discrete(name=...) above.
# In ggplot2 3.5.0 and later, prefer legend.position = "inside" with legend.position.inside = c(0.95,0.95).
p+theme(legend.position = c(0.95,0.95),
legend.justification = c(1,1),
legend.direction = "horizontal",
legend.background = element_rect(fill ="lightgrey",
color = "white"),
legend.key =element_blank())
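Placing the legend inside the panel can hide data points. A common alternative, sketched here as an illustration rather than as the original author's approach, is to put the legend below the plot and lay the keys out in a single row with guide_legend(). The legend title is set through labs(color = ...), which is equivalent to naming the scale.

```r
library(ggplot2)

legend_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(color = "Cylinders") +               # legend title via labs()
  guides(color = guide_legend(nrow = 1)) +  # lay the keys out in one row
  theme(legend.position = "bottom",
        legend.key = element_blank())
legend_plot
```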
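18.2.4 Customizing the plot area
The theme() arguments that customize the plot area include panel.background, panel.border, panel.grid.major, panel.grid.minor, and strip.background.
18.2.4.1 Example
In this example, the major grid lines are solid light grey, the minor grid lines are dashed light grey, the facet strip background is white, and the legend is placed at the top.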
library(ggplot2)
mtcars$am <- factor(mtcars$am,labels = c("Automatic","Manual"))
ggplot(mtcars,aes(x=disp,y=mpg))+
geom_point(aes(color=factor(cyl)),size=2)+
geom_smooth(method = "lm",
formula=y~x+I(x^2),linetype="dotted",se=FALSE)+
facet_wrap(~am,ncol=2)+
theme_bw()+
theme(strip.background = element_rect(fill="white"),
panel.grid.major = element_line(color = "lightgrey"),
panel.grid.minor = element_line(color = "lightgrey",
linetype = "dashed"),
legend.position = "top")
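The same panel arguments can be applied without starting from theme_bw(). The sketch below sets the panel background and border directly; faceting by gear and the particular colors are assumptions made for the illustration.

```r
library(ggplot2)

panel_plot <- ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point() +
  facet_wrap(~gear) +
  theme(panel.background = element_rect(fill = "white"),
        # fill = NA keeps the border rectangle from covering the points
        panel.border = element_rect(color = "grey50", fill = NA),
        panel.grid.major = element_line(color = "lightgrey"),
        panel.grid.minor = element_blank(),
        strip.background = element_rect(fill = "white", color = "grey50"))
panel_plot
```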
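18.3 Adding annotations
Annotations can be added with functions such as geom_text() and geom_label(), and with geom_text_repel() and geom_label_repel() from the ggrepel package.
18.3.1 Labeling data points
When observations are drawn as points, it can be hard to tell which point represents which observation, as in this plot of car weight (wt) against miles per gallon (mpg) from the mtcars data.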
library(ggplot2)
ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point(color="steelblue")
Label the observations with geom_text() and with geom_label():
library(ggplot2)
ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point(color="steelblue")+
geom_text(label=row.names(mtcars))
library(ggplot2)
ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point(color="steelblue")+
geom_label(label=row.names(mtcars))
The labels overlap badly. The ggrepel functions geom_text_repel() and geom_label_repel() move them apart:
library(ggplot2)
library(ggrepel)
ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point(color="steelblue")+
geom_text_repel(label=row.names(mtcars))
library(ggplot2)
ggplot(mtcars,aes(x=wt,y=mpg))+
geom_point(color="steelblue")+
geom_label_repel(label=row.names(mtcars))
The result shows that the overlap is largely resolved.
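Another way to reduce clutter is to label only the points of interest. The sketch below, which uses only ggplot2, copies the row names into a column and passes a subset to geom_text() through its data argument. The threshold of 30 mpg and the object names are assumptions chosen for illustration.

```r
library(ggplot2)

# Copy the row names into a column so they can be mapped with aes().
cars <- mtcars
cars$name <- row.names(mtcars)

# Label only the most fuel-efficient cars (threshold chosen for illustration).
efficient <- subset(cars, mpg > 30)

subset_plot <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(color = "steelblue") +
  geom_text(data = efficient, aes(label = name), vjust = -0.8, size = 3)
subset_plot
```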
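18.3.2 Labeling bars
1. Basic bar chart
Using the Wage dataset from the ISLR package, plot the percentage of participants in each marital status.
First compute the percentages: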
library(ISLR)
library(dplyr)
plotdata <- Wage %>%
group_by(maritl) %>%
summarise(n=n()) %>%
mutate(pct=n/sum(n),
lbls=scales::percent(pct))
plotdata
## # A tibble: 5 × 4
## maritl n pct lbls
## <fct> <int> <dbl> <chr>
## 1 1. Never Married 648 0.216 21.6%
## 2 2. Married 2074 0.691 69.1%
## 3 3. Widowed 19 0.00633 0.6%
## 4 4. Divorced 204 0.068 6.8%
## 5 5. Separated 55 0.0183 1.8%
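Draw the plot:
Note: stat="identity" is required here because the bar heights come from the y aesthetic (pct). The default stat of geom_bar() is count, which raises an error when a y aesthetic is supplied; geom_col() is an equivalent shortcut.
For geom_text() on a bar chart, the main settings are label (the label text), vjust (the vertical position of the label), and size (the label size).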
library(ggplot2)
ggplot(plotdata,aes(x=maritl,y=pct))+
geom_bar(stat = "identity",fill="steelblue")+
geom_text(aes(label=lbls),
vjust=-0.5,size=3)+
theme_bw()
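The same pattern can be tried without ISLR or dplyr. As a sketch under that assumption, the code below computes percentages of mtcars cars by cylinder count in base R and draws the labeled bars with geom_col(), which uses stat = "identity" by default. The object names are illustrative.

```r
library(ggplot2)

# Count cars per cylinder class and convert to proportions.
counts <- table(mtcars$cyl)
bar_data <- data.frame(
  cyl = factor(names(counts)),
  n = as.integer(counts)
)
bar_data$pct <- bar_data$n / sum(bar_data$n)
bar_data$lbls <- sprintf("%.1f%%", 100 * bar_data$pct)

# geom_col() takes the bar height from y, so no stat argument is needed.
bar_plot <- ggplot(bar_data, aes(x = cyl, y = pct)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = lbls), vjust = -0.5, size = 3) +
  theme_bw()
bar_plot
```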
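18.4 Combining plots
Plots are combined with the patchwork package. First save each plot as a separate named object, then combine the objects with | and /. A|B places plots A and B side by side, and A/B stacks A above B. Parentheses () create subgroups: for example, (A|B)/(C|D) arranges the four plots with A at top left, B at top right, C at bottom left, and D at bottom right.
Example: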
library(ggplot2)
library(patchwork)
##
## Attaching package: 'patchwork'
## The following object is masked from 'package:MASS':
##
## area
P1 <- ggplot(mtcars,aes(disp,mpg))+
geom_point()
P2 <- ggplot(mtcars,aes(factor(cyl),mpg))+
geom_boxplot()
P3 <- ggplot(mtcars,aes(mpg))+
geom_histogram(bins = 8,color="white")
(P1|P2)/P3
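Beyond | and /, patchwork provides plot_layout() to control relative sizes and plot_annotation() to add an overall title and panel tags. The sketch below rebuilds the three plots under new names and applies both; the height ratio and the title text are assumptions chosen for the illustration.

```r
library(ggplot2)
library(patchwork)

s1 <- ggplot(mtcars, aes(disp, mpg)) + geom_point()
s2 <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()
s3 <- ggplot(mtcars, aes(mpg)) + geom_histogram(bins = 8, color = "white")

# Top row twice as tall as the bottom row, with an overall title
# and automatic panel tags A, B, C.
combined <- (s1 | s2) / s3 +
  plot_layout(heights = c(2, 1)) +
  plot_annotation(title = "mtcars overview", tag_levels = "A")
combined
```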