One of the libraries covered here can generate interactive word clouds, which is really impressive!

References

Stunning word cloud effects that you can easily get with Python!

stylecloud

pypi
github

Installation

pip3 install stylecloud

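stylecloud has the following features:

Icon shapes of any size for the word cloud (via Font Awesome 5.11.2)
Support for advanced color palettes (via palettable)
Direct gradients for those palettes
Support for reading a text file, or a pre-generated CSV file (containing words and counts)
A command-line interface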

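Mask images

One of the factors that affects how good a word cloud looks is how the mask image is produced.

Hand-made mask images tend to have inconsistent resolution or need contrast adjustment, which is tedious. stylecloud instead uses Font Awesome directly as a ready-made solution.

Website link

Under the stylecloud\static folder there is a fontawesome.min CSS file that bundles a large number of icons; opening it shows the codes for many of them.

A cascading style sheet like this is hard to read, and it is not obvious how to use it. Fortunately, a Chinese website lists the icons by category, showing what each one looks like and what it is called.

Chinese website

It contains detailed descriptions and categories for the icons.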

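Basic usage

You can place Font Awesome icons almost anywhere using a style prefix and the icon's name. We have tried to design the icons so that they take on the characteristics of the surrounding text and sit naturally alongside it.

Font Awesome is designed to be used with inline elements, and we recommend using a consistent HTML element to reference the icons throughout your project. We like the <i> tag for its brevity, since most people now use <em></em> for emphasized/italicized semantic text. If that is not to your liking, using a <span> is more semantically correct.

You need to know two pieces of information to reference an icon:

its name, prefixed with fa-;
the prefix of the style you want to use.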

<i class="fas fa-camera"></i> <!-- this icon's 1) style prefix == fas and 2) icon name == camera -->
<i class="fas fa-camera"></i> <!-- using an <i> element to reference the icon -->
<span class="fas fa-camera"></span> <!-- using a <span> element to reference the icon -->

For example, to use the Apple logo as the mask image, the style prefix is fab and the fa- prefixed name is fa-apple, so simply set the icon_name parameter: icon_name='fab fa-apple'.
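The two pieces of information above (style prefix and icon name) can be combined into an icon_name string with a small helper. This is only a sketch: make_icon_name and render_icon_cloud are hypothetical helper names, and the set of prefixes assumes the three free Font Awesome 5 styles (fas, far, fab).

```python
# Free Font Awesome 5 style prefixes: solid, regular, brands (assumption for this sketch).
VALID_PREFIXES = {"fas", "far", "fab"}


def make_icon_name(style_prefix: str, icon: str) -> str:
    """Build an icon_name string such as 'fab fa-apple' (hypothetical helper)."""
    if style_prefix not in VALID_PREFIXES:
        raise ValueError(f"unknown style prefix: {style_prefix!r}")
    # Accept both 'apple' and 'fa-apple' as the icon name.
    if icon.startswith("fa-"):
        icon = icon[3:]
    return f"{style_prefix} fa-{icon}"


def render_icon_cloud(text: str, style_prefix: str, icon: str,
                      output_name: str = "icon_cloud.png") -> None:
    """Draw a word cloud in the shape of the given icon (requires stylecloud)."""
    # Imported lazily so the helper above can be used without stylecloud installed.
    from stylecloud import gen_stylecloud

    gen_stylecloud(
        text=text,
        icon_name=make_icon_name(style_prefix, icon),
        output_name=output_name,
    )


if __name__ == "__main__":
    print(make_icon_name("fab", "apple"))
```

Calling render_icon_cloud("some text here", "fab", "apple") would then write the image to icon_cloud.png.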

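Color schemes

The color scheme is the other major factor in how good a word cloud looks. stylecloud has a good answer here too: it implements color schemes with the advanced palettes from palettable.

palettable website

We can change the word cloud's colors by setting the parameter palette='<color scheme>'.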
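To preview the colors behind a palette string before drawing, the name can be resolved against the palettable package. The sketch below assumes palettable is installed and that the string has the same dotted form as the palette names used in this post (for example cartocolors.qualitative.Bold_5); split_palette_name and palette_hex_colors are hypothetical helper names.

```python
import importlib


def split_palette_name(palette: str) -> tuple:
    """Split 'cartocolors.qualitative.Bold_5' into (module path, palette name)."""
    module_path, _, name = palette.rpartition(".")
    if not module_path or not name:
        raise ValueError(f"expected '<module>.<palette>', got {palette!r}")
    return f"palettable.{module_path}", name


def palette_hex_colors(palette: str) -> list:
    """Return the palette's colors as hex strings (requires palettable)."""
    module_name, name = split_palette_name(palette)
    module = importlib.import_module(module_name)
    # palettable palette objects expose their colors as .hex_colors
    return getattr(module, name).hex_colors


if __name__ == "__main__":
    print(split_palette_name("cartocolors.qualitative.Bold_5"))
```

Note that gen_stylecloud itself takes the palette as a plain string, so palette='cartocolors.qualitative.Bold_5' is passed in quotes.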

Drawing the word cloud

gen_stylecloud()

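The main parameters are:

text: the input text. Best used when calling the function directly.
file_path: path to the input text/CSV file.
icon_name: icon name for the shape of the stylecloud (e.g. fas fa-grin-beam). [default: fas fa-flag]
palette: the color scheme. stylecloud takes its color schemes from palettable, a very handy module that collects a large number of classic palettes. [default: cartocolors.qualitative.Bold_5]
output_name: output file name of the stylecloud. [default: stylecloud.png]
gradient: direction of the gradient; if it is not None, stylecloud applies a directional gradient. [default: None]
size: resolution of the output image. stylecloud outputs a square image by default, so a single integer gives both width and height. [default: 512]
font_path: path to the .ttf font file used by the stylecloud. [default: uses included Staatliches font]
random_state: controls the random state of the words and colors.
background_color: a string that sets the background color of the word cloud; accepts a color name or a hex color. [default: white]
max_font_size: maximum font size in the stylecloud. [default: 200]
max_words: maximum number of words the stylecloud can contain. [default: 2000]
stopwords: a bool that controls whether stop-word removal is enabled, using the built-in English stop-word list. [default: True]
custom_stopwords: a list of custom stop words, used together with stopwords.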
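The file_path parameter also accepts a pre-generated CSV of words and counts. The sketch below writes such a file with the standard library and hands it to gen_stylecloud. It assumes a two-column layout with a header row (the column names word and count are my own choice, not something the source specifies); write_word_counts_csv and render_from_csv are hypothetical helper names.

```python
import csv
from collections import Counter


def write_word_counts_csv(words, csv_path, top_n=100):
    """Count words and write the top_n as 'word,count' rows under a header row."""
    counts = Counter(words).most_common(top_n)
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "count"])  # header row (assumed layout)
        writer.writerows(counts)
    return counts


def render_from_csv(csv_path, output_name="stylecloud_from_csv.png"):
    """Draw a word cloud from the CSV written above (requires stylecloud)."""
    # Imported lazily so the CSV helper works without stylecloud installed.
    from stylecloud import gen_stylecloud

    gen_stylecloud(file_path=csv_path, output_name=output_name)


if __name__ == "__main__":
    sample = ["词云", "词云", "词云", "图标", "图标", "配色"]
    print(write_word_counts_csv(sample, "word_counts.csv"))
```

Writing the counts once and reusing the CSV avoids re-running the segmentation step each time only the icon or palette changes.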

The code is as follows

# -*- coding: UTF-8 -*-
"""
@Author  : 叶庭云
@CSDN    : https://yetingyun.blog.csdn.net/
"""
from stylecloud import gen_stylecloud
import jieba
import re

# Read the data
with open('datas.txt', encoding='utf-8') as f:
    data = f.read()

# Preprocess the text: drop useless characters and keep only the Chinese runs
new_data = re.findall('[\u4e00-\u9fa5]+', data)
new_data = "/".join(new_data)

# Tokenize the text in precise mode
seg_list_exact = jieba.cut(new_data, cut_all=False)

# Load the stop words, one per line
with open('stop_words.txt', encoding='utf-8') as f:
    stop_words = {line.strip() for line in f if line.strip()}

result_list = []
for word in seg_list_exact:
    # Drop stop words and single-character tokens
    if word not in stop_words and len(word) > 1:
        result_list.append(word)
print(result_list)

# Palettes I personally recommend; they look quite good
# colorbrewer.qualitative.Dark2_7
# cartocolors.qualitative.Bold_5
# colorbrewer.qualitative.Set1_8

gen_stylecloud(
    text=' '.join(result_list),                # text data
    size=600,                                  # size of the word cloud image
    font_path=r'C:\Windows\Fonts\msyh.ttc',    # a Chinese font is required to display Chinese words
    output_name='词云.png',                    # output file name
    icon_name='fas fa-grin-beam',              # icon used as the shape
    palette='cartocolors.qualitative.Bold_5'   # color scheme, passed as a string
)
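The preprocessing in the script above (keep only Chinese characters, load stop words, drop stop words and single-character tokens) can be factored into small reusable functions. This is a sketch using only the standard library; the function names are hypothetical, and load_stop_words strips all surrounding whitespace rather than only the trailing newline.

```python
import re

# Matches runs of characters in the same range the script uses.
CHINESE_RUN = re.compile(r"[\u4e00-\u9fa5]+")


def extract_chinese(text: str, sep: str = " ") -> str:
    """Keep only the Chinese runs of text, joined by sep."""
    return sep.join(CHINESE_RUN.findall(text))


def load_stop_words(path: str) -> set:
    """Read one stop word per line, ignoring blank lines."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def filter_tokens(tokens, stop_words, min_len: int = 2) -> list:
    """Drop stop words and tokens shorter than min_len characters."""
    return [t for t in tokens if len(t) >= min_len and t not in stop_words]


if __name__ == "__main__":
    print(extract_chinese("Hello, 词云! 123 很好看"))
    print(filter_tokens(["词云", "的", "非常", "好看"], {"非常"}))
```

With these in place, the tokens from jieba.cut can be passed straight to filter_tokens, and the same helpers can be reused for other word cloud libraries.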

Result

wordcloud

References

github
It does not seem to work on macOS.

Installation

If you are using pip:

pip install wordcloud

If you are using conda, you can install from the conda-forge channel:

conda install -c conda-forge wordcloud

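The wordcloud library treats a word cloud as a WordCloud object.

wordcloud.WordCloud() represents the word cloud for one text
The word cloud can be drawn from parameters such as how often words occur in the text
The shape, size, and colors of the word cloud can all be configured

Configuring the object's parameters.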

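Parameter	Description	Example
width	Width of the generated image, default 400 pixels	w = wordcloud.WordCloud(width=600)
height	Height of the generated image, default 200 pixels	w = wordcloud.WordCloud(height=400)
min_font_size	Smallest font size used in the word cloud, default 4	w = wordcloud.WordCloud(min_font_size=10)
max_font_size	Largest font size used in the word cloud; adjusted automatically from the image height by default	w = wordcloud.WordCloud(max_font_size=20)
font_step	Step between font sizes, default 1	w = wordcloud.WordCloud(font_step=2)
font_path	Path to the font file, default None	w = wordcloud.WordCloud(font_path="msyh.ttc")
max_words	Maximum number of words shown in the word cloud, default 200	w = wordcloud.WordCloud(max_words=20)
stopwords	Set of words to exclude, i.e. words that are not displayed	w = wordcloud.WordCloud(stopwords={"Python"})
mask	Shape of the word cloud, rectangular by default; must be passed as an image array (scipy.misc.imread has been removed from SciPy, so load the image another way)	import numpy as np; from PIL import Image; mk = np.array(Image.open("pic.png")); w = wordcloud.WordCloud(mask=mk)
background_color	Background color of the word cloud image, default black	w = wordcloud.WordCloud(background_color="white")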

Implementation:

# -*- coding: UTF-8 -*-
"""
@Author  : 叶庭云
@CSDN    : https://yetingyun.blog.csdn.net/
"""
import jieba
import collections
import re
from wordcloud import WordCloud
import matplotlib.pyplot as plt


# 958 comments
with open('data.txt', encoding='utf-8') as f:
    data = f.read()

# Preprocess the text: drop useless characters and keep only the Chinese runs
new_data = re.findall('[\u4e00-\u9fa5]+', data)
new_data = " ".join(new_data)

# Tokenize the text in precise mode
seg_list_exact = jieba.cut(new_data, cut_all=False)

# Load the stop words, one per line
with open('stop_words.txt', encoding='utf-8') as f:
    stop_words = {line.strip() for line in f if line.strip()}

result_list = []
for word in seg_list_exact:
    # Drop stop words and single-character tokens
    if word not in stop_words and len(word) > 1:
        result_list.append(word)
print(result_list)

# Count the words that survived filtering
word_counts = collections.Counter(result_list)
# Take the 100 most frequent words
word_counts_top100 = word_counts.most_common(100)
print(word_counts_top100)

# Draw the word cloud
my_cloud = WordCloud(
    background_color='white',  # background color, black by default
    width=900, height=600,
    max_words=100,            # maximum number of words shown
    font_path='simhei.ttf',   # font that can display Chinese
    max_font_size=99,         # maximum font size
    min_font_size=16,         # minimum font size
    random_state=50           # seed for the random layout and colors, for reproducible output
).generate_from_frequencies(word_counts)

# Show the generated word cloud image
plt.imshow(my_cloud, interpolation='bilinear')
# Hide the axes
plt.axis('off')
plt.show()

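Word cloud:

Drawing word clouds with WordCloud from the pyecharts library

pyecharts is a Python library built on echarts that can draw many kinds of interactive charts. Unlike many other visualization libraries, pyecharts supports method chaining.

In other words, adding chart elements or changing the chart configuration only takes a simple call to the relevant component.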
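As a rough illustration of method chaining, the sketch below builds and renders a word cloud in one chained expression. It assumes pyecharts 1.x or later is installed; top_pairs and render_chained are hypothetical helper names, and the series name, title, and output path are placeholders.

```python
from collections import Counter


def top_pairs(words, n=100):
    """Return the n most frequent words as (word, count) pairs."""
    return Counter(words).most_common(n)


def render_chained(pairs, path="wordcloud_chained.html"):
    """Build and render a word cloud with one chained expression (requires pyecharts)."""
    # Imported lazily so top_pairs can be used without pyecharts installed.
    from pyecharts import options as opts
    from pyecharts.charts import WordCloud

    return (
        WordCloud()
        .add("word frequency", data_pair=pairs, word_size_range=[15, 80])
        .set_global_opts(title_opts=opts.TitleOpts(title="Chained-call sketch"))
        .render(path)
    )


if __name__ == "__main__":
    print(top_pairs(["词云", "图表", "词云"], n=1))
```

Each of add and set_global_opts returns the chart itself, which is what allows the calls to be strung together before the final render.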

# class pyecharts.charts.WordCloud
class WordCloud(
    # Initialization options, see `global_options.InitOpts`
    init_opts: opts.InitOpts = opts.InitOpts()
)

# func pyecharts.charts.WordCloud.add
def add(
    # Series name, used for tooltip display and legend filtering.
    series_name: str,

    # Series data items, [(word1, count1), (word2, count2)]
    data_pair: Sequence,

    # Outline of the word cloud; one of 'circle', 'cardioid', 'diamond', 'triangle-forward', 'triangle', 'pentagon', 'star'
    shape: str = "circle",

    # Custom image (jpg, jpeg, png, and ico are currently supported; other image formats are untested)
    # This parameter accepts:
    # 1. base64 (the data header must be included);
    # 2. a local file path (relative or absolute)
    # Note: with mask_image, the first render may come up blank; refreshing once fixes it (an Echarts issue)
    # Echarts Issue: https://github.com/ecomfe/echarts-wordcloud/issues/74
    mask_image: types.Optional[str] = None,

    # Gap between words
    word_gap: Numeric = 20,

    # Range of word font sizes
    word_size_range=None,

    # Rotation step for words, in degrees
    rotate_step: Numeric = 45,

    # Distance from the left
    pos_left: types.Optional[str] = None,

    # Distance from the top
    pos_top: types.Optional[str] = None,

    # Distance from the right
    pos_right: types.Optional[str] = None,

    # Distance from the bottom
    pos_bottom: types.Optional[str] = None,

    # Width of the word cloud
    width: types.Optional[str] = None,

    # Height of the word cloud
    height: types.Optional[str] = None,

    # Allow word cloud data to be drawn outside the canvas
    is_draw_out_of_bound: bool = False,

    # Tooltip component options, see `series_options.TooltipOpts`
    tooltip_opts: Union[opts.TooltipOpts, dict, None] = None,

    # Text style options for the word cloud
    textstyle_opts: types.TextStyle = None,

    # Blur radius of the text shadow on emphasis
    emphasis_shadow_blur: types.Optional[types.Numeric] = None,

    # Color of the text shadow on emphasis
    emphasis_shadow_color: types.Optional[str] = None,
)

Implementation

# -*- coding: UTF-8 -*-
"""
@Author  : 叶庭云
@CSDN    : https://yetingyun.blog.csdn.net/
"""
import jieba
import collections
import re
from pyecharts.charts import WordCloud
from pyecharts.globals import SymbolType
from pyecharts import options as opts
from pyecharts.globals import ThemeType, CurrentConfig


# Load the JS assets from a local copy; omit this line to use the default host
CurrentConfig.ONLINE_HOST = 'D:/python/pyecharts-assets-master/assets/'

# 958 comments
with open('data.txt', encoding='utf-8') as f:
    data = f.read()

# Preprocess the text: drop useless characters and keep only the Chinese runs
new_data = re.findall('[\u4e00-\u9fa5]+', data)
new_data = " ".join(new_data)

# Tokenize the text in precise mode
seg_list_exact = jieba.cut(new_data, cut_all=False)

# Load the stop words, one per line
with open('stop_words.txt', encoding='utf-8') as f:
    stop_words = {line.strip() for line in f if line.strip()}

result_list = []
for word in seg_list_exact:
    # Drop stop words and single-character tokens
    if word not in stop_words and len(word) > 1:
        result_list.append(word)
print(result_list)


# Count the words that survived filtering
word_counts = collections.Counter(result_list)
# Take the 100 most frequent words
word_counts_top100 = word_counts.most_common(100)
# Print the counts to inspect them
print(word_counts_top100)

word1 = WordCloud(init_opts=opts.InitOpts(width='1350px', height='750px', theme=ThemeType.MACARONS))
word1.add('词频', data_pair=word_counts_top100,
          word_size_range=[15, 108], textstyle_opts=opts.TextStyleOpts(font_family='cursive'),
          shape=SymbolType.DIAMOND)
word1.set_global_opts(title_opts=opts.TitleOpts('商品评论词云图'),
                      toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical'),
                      tooltip_opts=opts.TooltipOpts(is_show=True, background_color='red', border_color='yellow'))
word1.render("商品评论词云图.html")

Word cloud:

A word cloud drawn with pyecharts is rendered in a web page and is interactive, and there are many configuration options for making it look better.