《KVM私有云架构设计与实践》 (KVM Private Cloud Architecture Design and Practice): Table of Contents

The book is now in print; purchase links are given below, or contact the author directly on WeChat (lofyer_org).

Dangdang
Amazon
JD.com

Preface 1
Chapter 1  The Private Cloud Industry Today 8
1.1 What "Private Cloud" Means 8
1.2 Domestic Private Cloud Vendors and Their Industry Customers 9
1.2.1 Vertical Segments 9
1.2.2 Deployment Scenarios 11
1.3 Summary 23
Chapter 2  Infrastructure Design 24
2.1 Basic Architecture Principles 24
2.1.1 Sensible Storage Configuration 25
2.1.2 A Stable Network Foundation 32
2.1.3 Reliable Compute Resources 37
2.2 Architecture Security 42
2.2.1 Authentication and Authorization 42
2.2.2 Service Security 44
2.3 "Cloud-Style" Architecture 48
2.3.1 Resource Pooling 48
2.3.2 SLA Management 49
2.4 An OpenStack Infrastructure Design Example 57
2.5 Summary 63
Chapter 3  Software Architecture Design 64
3.1 Open-Source Cloud Architectures at a Glance 65
3.1.1 Hybrid Cloud: OpenStack 65
3.1.2 IaaS: oVirt 68
3.1.3 PaaS: OpenShift 69
3.2 Cluster Architecture and Software Design Principles 70
3.2.1 Cluster Architecture 70
3.2.2 Control Units and Service Agents 74
3.2.3 Platform Design 78
3.3 Implementing Services 84
3.3.1 Service Elements 85
3.3.2 A Framework Example 86
3.3.3 IaaS Extensions 86
3.3.4 PaaS Extensions 87
3.4 A PaaS Deployment Example: OpenShift 88
3.5 Summary 91
Chapter 4  KVM Virtualization Fundamentals 92
4.1 QEMU 92
4.1.1 QEMU/KVM in Brief 92
4.1.2 Machine Models 106
4.1.3 Device Inventory 111
4.1.4 The QEMU Console 122
4.1.5 QAPI 127
4.2 Libvirt 133
4.2.1 Basic Concepts and Use Cases 133
4.2.2 Describing Objects 142
4.2.3 The Virsh Console 159
4.2.4 Programming Examples 163
4.3 Quick Start 168
4.3.1 Setting Up VirtManager 169
4.3.2 Suggestions for Further Study 173
Chapter 5  Container Fundamentals 175
5.1 Introduction to Containers 175
5.1.1 History 175
5.1.2 Technical Implementation 176
5.2 Docker 184
5.2.1 Basic Architecture 184
5.2.2 Key Elements 186
5.2.3 Surrounding Tooling 228
5.3 Security Risks and Countermeasures 231
Chapter 6  Private Cloud Networking Fundamentals 233
6.1 Network Model Terminology 233
6.2 Classic Virtualized Networking 239
6.2.1 Bridged Networks 239
6.2.2 NAT Networks 242
6.2.3 VLAN Networks 246
6.3 Software-Defined Networking 254
6.3.1 Technical Foundations 256
6.3.2 Virtualized Networking Examples 300
Chapter 7  Private Cloud Storage Fundamentals 309
7.1 Basic Storage Elements 309
7.1.1 VFS 311
7.1.2 File Systems 313
7.1.3 Block Devices 320
7.2 Virtual Machine Disk Storage 322
7.2.1 Virtual Disks 323
7.2.2 Stateless Storage 340
7.2.3 Storage Pools 341
7.3 Distributed Storage Backends 350
7.3.1 GlusterFS 353
7.3.2 Ceph 359
Chapter 8  Industry Case Studies 371
8.1 A Bank Private Cloud on VMware and Citrix 371
8.1.1 User Requirements 371
8.1.2 Architecture Design 371
8.1.3 Implementation 372
8.2 A University Private Cloud on OpenStack 372
8.2.1 User Requirements 372
8.2.2 Architecture Design 373
8.2.3 Implementation 373
8.3 A Secondary-School Teaching Desktop Cloud on oVirt 373
8.3.1 User Requirements 373
8.3.2 Architecture Design 373
8.3.3 Implementation 373
Chapter 9  Desktop Cloud Best Practices 375
9.1 Device Passthrough and Redirection 375
9.1.1 PCI/PCI-E Devices 375
9.1.2 SR-IOV 376
9.1.3 USB 382
9.1.4 Serial and Parallel Ports 386
9.2 GPUs and Desktop Protocols 387
9.2.1 Physical GPU Passthrough 388
9.2.2 Open-Source vGPU Implementations 391
9.2.3 3D Remote Desktop Protocols 397
9.3 Out-of-Band File Management 403
9.3.1 Technical Foundations 403
9.3.2 File Monitoring and Auditing 405
9.3.3 Centralized Virus Scanning 405
9.3.4 Incremental Software Distribution 405
9.4 Optimizing the VM Experience 406
9.4.1 CPU 406
9.4.2 Memory 409
9.4.3 Disk 411
9.4.4 Network 413
9.4.5 VirtIO Multi-Queue 414
9.4.6 Guest Tools 414
9.4.7 Single Sign-On 415
9.4.8 QEMU FT 415
9.5 Host System Optimization 418
9.5.1 Performance Monitoring Tools 418
9.5.2 Storage 420
9.5.3 Network 422
9.5.4 Motherboard Options 422
9.6 Client Deployment 423
9.7 P2V/V2P Migration 424
9.8 Data Backup 430
9.8.1 Offline Backup 431
9.8.2 Online Backup 431
Chapter 10  Operations and Testing Tools 433
10.1 Monitoring and Logging 433
10.1.1 Monitoring 433
10.1.2 Logging 433
10.2 Operations Tools 433
10.2.1 Host Configuration Management 433
10.2.2 Automated Configuration Tools 433
10.3 Load Balancing and High Availability 433
10.3.1 Load Balancing 434
10.3.2 High Availability 434
10.4 Testing 434
10.4.1 Performance Benchmarks 435
10.4.2 Functional Unit Tests 435
Appendix 1  Troubleshooting Tips and Suggestions 435
Appendix 2  References and Projects 437

15. ovirt-shell

Like many virtualization platforms, oVirt ships a shell of its own, with some distinctive features.

Connecting to ovirt-shell

# yum install ovirt-engine-cli
# ovirt-shell -l https://192.168.1.190 -u admin@internal -I

Once connected, press Tab twice for auto-completion.
As you can see, in oVirt a vm, host or cluster can each be treated as an "object", and operations such as "start" and "stop" as its methods.
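In practice the object/method pattern looks like the following session (command forms as in oVirt 3.x's ovirt-shell; the VM name myvm is a placeholder):

```
[oVirt shell (connected)]# list vms
[oVirt shell (connected)]# show vm myvm
[oVirt shell (connected)]# action vm myvm start
[oVirt shell (connected)]# action vm myvm stop
```

Here vms is the collection, myvm the object, and start/stop the actions invoked on it.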

14. Advanced oVirt Usage

ovirt-shell

ovirt-shell is well suited to certain scenarios, such as scripted administration or hosts without a browser.

# ovirt-shell -I -u admin@internal -l https://server_ip/api

  ============================================================================
                      >>> connected to oVirt manager 3.4.0.0 <<<
  ============================================================================

  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

                             Welcome to oVirt shell

  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


[oVirt shell (connected)]#

Host hooks

See vdsm-hooks.

Cluster policies

See this PDF.

Growing disks with libguestfs

LVM-based growth
Plain growth

UI plugins

ShellInABox

Platform plugins

Foreman
OpenStack Network
OpenStack Image

Manually creating an export domain

Build a directory tree like this:

# tree exports
.
├── 37e0e64b-5445-4bc3-8675-ceff4637e8e3/
│   ├── dom_md/
│   │   ├── ids
│   │   ├── inbox
│   │   ├── leases
│   │   ├── metadata
│   │   └── outbox
│   ├── images/
│   └── master/
│       ├── tasks/
│       └── vms/
└── __DIRECT_IO_TEST__
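The skeleton above can be created in one go. A sketch (the function name is mine; the UUID is the example one from the listing, so generate a fresh one for a real domain, and note that __DIRECT_IO_TEST__ is just an empty sentinel file):

```python
import os
import tempfile
import uuid

def make_export_skeleton(root, sd_uuid=None):
    """Create the empty directory/file layout of the export domain tree above."""
    sd_uuid = sd_uuid or str(uuid.uuid4())
    dom_md = os.path.join(root, sd_uuid, "dom_md")
    os.makedirs(dom_md, exist_ok=True)
    for sub in ("images", os.path.join("master", "tasks"), os.path.join("master", "vms")):
        os.makedirs(os.path.join(root, sd_uuid, sub), exist_ok=True)
    # dom_md files start out empty; leases and metadata are filled in next
    for name in ("ids", "inbox", "leases", "metadata", "outbox"):
        open(os.path.join(dom_md, name), "wb").close()
    # empty sentinel file used for direct-I/O probing
    open(os.path.join(root, "__DIRECT_IO_TEST__"), "wb").close()
    return sd_uuid

if __name__ == "__main__":
    root = tempfile.mkdtemp(prefix="exports-")
    print(make_export_skeleton(root, "37e0e64b-5445-4bc3-8675-ceff4637e8e3"))
```

The leases and metadata files still need the content described in the following steps.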

Create the leases file:

# echo 2d2d2d2d2d2d465245452d2d2d2d2d2d3030303030303030303030303030
303000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
0000 | xxd -r -p > leases
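The hex blob above is simply a 512-byte lease block: the ASCII marker ------FREE------, sixteen ASCII '0' characters, and NUL padding. If xxd is not available, the same file can be produced with a short Python sketch (the layout constants are read off the hex dump above, not taken from any sanlock documentation):

```python
# Recreate the 512-byte "leases" block dumped above:
# a "------FREE------" marker, sixteen ASCII '0' bytes, then NUL padding.
MARKER = b"------FREE------"
ZEROS = b"0" * 16
BLOCK_SIZE = 512

lease = MARKER + ZEROS
lease += b"\x00" * (BLOCK_SIZE - len(lease))  # pad with NULs to 512 bytes

with open("leases", "wb") as f:
    f.write(lease)
```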

Add the following content to the metadata file:

CLASS=Backup
DESCRIPTION=Export
IOOPTIMEOUTSEC=10
LEASERETRIES=3
LEASETIMESEC=60
LOCKPOLICY=
LOCKRENEWALINTERVALSEC=5
MASTER_VERSION=0
POOL_UUID=
REMOTE_PATH=192.168.1.108:/home/nfs_exports
ROLE=Regular
SDUUID=37e0e64b-5445-4bc3-8675-ceff4637e8e3
TYPE=NFS
VERSION=0
#_SHA_CKSUM=5737f1270bf93fdd660fea819655b01a34c315b9

Compute the SHA checksum with the following script (adapted from packaging/setup/plugins/ovirt-engine-setup/config/iso_domain.py in the engine source) and put the result in the _SHA_CKSUM field of metadata:

#!/usr/bin/python
# Compute the _SHA_CKSUM value for a storage domain metadata file.

import hashlib
from optparse import OptionParser

if __name__ == "__main__":
    parser = OptionParser()
    (options, args) = parser.parse_args()
    if len(args) != 1:
        parser.error("Missing metadata file")
    mds = {}
    with open(args[0], "r") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            try:
                key, value = line.split('=', 1)
            except ValueError:
                continue
            if key == '_SHA_CKSUM':
                continue
            mds[key] = value
    csum = hashlib.sha1()
    # Keys are sorted and the "key=value" pairs are concatenated with no
    # separator, matching the engine's own checksum routine.
    for key in sorted(mds):
        csum.update(("%s=%s" % (key, mds[key])).encode("utf-8"))
    print(csum.hexdigest())

Fix the ownership:

# chown -R vdsm.kvm exports

The directory can then be imported as an empty export domain.

13. Completing the Platform

If you have read this far, you should already have a data center, a few hosts, and perhaps a VM (the engine); all that is still missing for a minimally functional oVirt platform is somewhere to store VM images.

Ok, here we go.

Adding nodes (hosts)

Whether you built the plain oVirt platform of section 11 or the HA platform of section 12, you may want more nodes to support a better SLA (service level agreement).
There are currently three ways to add a node: install from the oVirt node ISO and then register the host; convert an existing CentOS or Fedora machine into a node; or use externally provisioned hosts (Foreman). Here we use the second method.

add_node_1

Adding storage domains

There are three kinds of storage domains: Data, ISO and Export.
A data domain is mandatory: before any VM can be created, an available data domain is needed to hold virtual disks and snapshot files. An ISO domain holds system images or driver files in ISO and VFD format. An export domain is used to export or import VMs in OVF format.
The data domain can be backed by five storage types (NFS, POSIX-compliant, GlusterFS, iSCSI, Fibre Channel). Here we suggest NFS exported from a GlusterFS volume, which keeps the simplicity of NFS while gaining GlusterFS's replication underneath.

add_storage_1

Note: make sure the storage domain is owned by 36:36, that is, vdsm.kvm.

11. Setting Up an oVirt Virtualization Platform

For first-time oVirt users this is the recommended way to get started; anything more involved tends to scare newcomers away.

We use the four machines from before, gs1.lofyer.org, gs2.lofyer.org, gs3.lofyer.org and gs4.lofyer.org: gs1 acts as the management machine running ovirt-engine, the other three serve as nodes, and storage is the GlusterFS volume created earlier.

Preparation

Storage can reuse the earlier GlusterFS volume over NFSv3; make sure the bricks are owned by vdsm.kvm, that is, 36:36.

# gluster volume create gluster-vol1 replica 2 gs1.example.com:/gluster_brick0 gs2.example.com:/gluster_brick0 gs3.example.com:/gluster_brick0 gs4.example.com:/gluster_brick0 gs1.example.com:/gluster_brick1 gs2.example.com:/gluster_brick1 gs3.example.com:/gluster_brick1 gs4.example.com:/gluster_brick1 force

gluster-ovirt

Because the network services of both engine and nodes depend on network rather than NetworkManager, enable the former and disable the latter, applying configuration along these lines on every server.

NAME=eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
# adjust the IP for each host
IPADDR=192.168.10.101
NETMASK=255.255.255.0
GATEWAY=192.168.10.1
DNS1=192.168.10.1

# chkconfig NetworkManager off
# chkconfig network on
# service NetworkManager stop; service network restart

Add the repositories

# yum localinstall http://resources.ovirt.org/releases/ovirt-release.noarch.rpm

Or add them manually:

[ovirt-stable]
name=Latest oVirt Releases
baseurl=http://ovirt.org/releases/stable/rpm/EL/$releasever/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[ovirt-3.4-stable]
name=Latest oVirt 3.4.z Releases
baseurl=http://ovirt.org/releases/3.4/rpm/EL/$releasever/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[epel]
name=Extra Packages for Enterprise Linux 6 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/6/$basearch
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
failovermethod=priority
enabled=1
includepkgs=epel-release,python-uinput,puppet,python-lockfile,python-cpopen,python-ordereddict,python-pthreading,python-inotify,python-argparse,novnc,python-ply,python-kitchen,python-daemon,python-websockify,livecd-tools,spice-html5,mom
gpgcheck=0

[ovirt-glusterfs-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/$basearch/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[ovirt-glusterfs-noarch-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/noarch
enabled=1
skip_if_unavailable=1
gpgcheck=0

Install the management node

Run the following commands on gs1.

# yum install ovirt-engine
# engine-setup --offline
[ INFO  ] Stage: Initializing
[ INFO  ] Stage: Environment setup
          Configuration files: ['/etc/ovirt-engine-setup.conf.d/10-packaging.conf']
          Log file: /var/log/ovirt-engine/setup/ovirt-engine-setup-20140508054649.log
          Version: otopi-1.2.0 (otopi-1.2.0-1.el6)
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization
         
          --== PRODUCT OPTIONS ==--
         
         
          --== PACKAGES ==--
         
         
          --== NETWORK CONFIGURATION ==--
         
          Host fully qualified DNS name of this server [gs1.lofyer.org]: 
          Setup can automatically configure the firewall on this system.
          Note: automatic configuration of the firewall may overwrite current settings.
          Do you want Setup to configure the firewall? (Yes, No) [Yes]: 
          The following firewall managers were detected on this system: iptables
          Firewall manager to configure (iptables): iptables
[ INFO  ] iptables will be configured as firewall manager.
         
          --== DATABASE CONFIGURATION ==--
         
          Where is the Engine database located? (Local, Remote) [Local]: 
          Setup can configure the local postgresql server automatically for the engine to run. This may conflict with existing applications.
          Would you like Setup to automatically configure postgresql and create Engine database, or prefer to perform that manually? (Automatic, Manual) [Automatic]: 
         
          --== OVIRT ENGINE CONFIGURATION ==--
         
          Application mode (Both, Virt, Gluster) [Both]: 
          Default storage type: (NFS, FC, ISCSI, POSIXFS) [NFS]: 
          Engine admin password: 
          Confirm engine admin password: 
         
          --== PKI CONFIGURATION ==--
         
          Organization name for certificate [lofyer.org]: 
         
          --== APACHE CONFIGURATION ==--
         
          Setup can configure apache to use SSL using a certificate issued from the internal CA.
          Do you wish Setup to configure that, or prefer to perform that manually? (Automatic, Manual) [Automatic]: 
          Setup can configure the default page of the web server to present the application home page. This may conflict with existing applications.
          Do you wish to set the application as the default page of the web server? (Yes, No) [Yes]: 
         
          --== SYSTEM CONFIGURATION ==--
         
          Configure WebSocket Proxy on this machine? (Yes, No) [Yes]: 
          Configure an NFS share on this server to be used as an ISO Domain? (Yes, No) [Yes]: no
         
          --== MISC CONFIGURATION ==--
         
         
          --== END OF CONFIGURATION ==--
         
[ INFO  ] Stage: Setup validation
         
          --== CONFIGURATION PREVIEW ==--
         
          Engine database name                    : engine
          Engine database secured connection      : False
          Engine database host                    : localhost
          Engine database user name               : engine
          Engine database host name validation    : False
          Engine database port                    : 5432
          PKI organization                        : lofyer.org
          Application mode                        : both
          Firewall manager                        : iptables
          Update Firewall                         : True
          Configure WebSocket Proxy               : True
          Host FQDN                               : gs1.lofyer.org
          Datacenter storage type                 : nfs
          Configure local Engine database         : True
          Set application as default page         : True
          Configure Apache SSL                    : True
         
          Please confirm installation settings (OK, Cancel) [OK]: ok
[ INFO  ] Stage: Transaction setup
[ INFO  ] Stopping engine service
[ INFO  ] Stopping websocket-proxy service
[ INFO  ] Stage: Misc configuration
[ INFO  ] Stage: Package installation
[ INFO  ] Stage: Misc configuration
[ INFO  ] Initializing PostgreSQL
[ INFO  ] Creating PostgreSQL 'engine' database
[ INFO  ] Configuring PostgreSQL
[ INFO  ] Creating Engine database schema
[ INFO  ] Creating CA
[ INFO  ] Configuring WebSocket Proxy
[ INFO  ] Generating post install configuration file '/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf'
[ INFO  ] Stage: Transaction commit
[ INFO  ] Stage: Closing up
         
          --== SUMMARY ==--
         
          SSH fingerprint: 1B:FD:08:A2:FD:83:20:8A:65:F5:0D:F6:CB:BF:46:C7
          Internal CA 28:7E:D6:6B:F7:F2:6C:B5:60:27:44:C3:7F:3C:22:63:E5:68:DD:F4
          Web access is enabled at:
              http://gs1.lofyer.org:80/ovirt-engine
              https://gs1.lofyer.org:443/ovirt-engine
          Please use the user "admin" and password specified in order to login into oVirt Engine
         
          --== END OF SUMMARY ==--
         
[ INFO  ] Starting engine service
[ INFO  ] Restarting httpd
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20140508054842-setup.conf'
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20140508054649.log
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Execution of setup completed successfully

This completes the management node installation.

Next, add nodes and storage domains; see section 13.

12. Setting Up Highly Available oVirt (hosted engine)

At the time of writing, oVirt had reached version 3.4.

Here we use the distributed-replicate storage created earlier, which gives the platform's core services a higher degree of availability.

A few points deserve attention:

1. The hosts' CPU family must be one of Westmere (Westmere E56xx/L56xx/X56xx), Nehalem (Intel Core i7 9xx), Penryn (Intel Core 2 Duo P9xxx) or Conroe (Intel Celeron_4x0); otherwise the cluster CPU type will be incompatible with the hosts and the data center will fail to start.

For the CPU family table, see
Intel Architecture and Processor Identification With CPUID Model and Family Numbers

2. It is advisable to prepare in advance, following section 11, a VM that already contains the oVirt engine, with its disk in RAW format, so that during hosted-engine setup it can be imported as an OVF or copied over the new virtual disk, shortening the window in which a failure costs time.
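To see where a host falls in that table, read the model name from /proc/cpuinfo. A minimal parser sketch (the "model name" field is standard in Linux's /proc/cpuinfo; match the value against the table referenced above):

```python
import os

def cpu_models(cpuinfo_text):
    """Return the distinct 'model name' values found in /proc/cpuinfo content."""
    models = []
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "model name" and value.strip() not in models:
            models.append(value.strip())
    return models

if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        print(cpu_models(f.read()))
```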

Preparation

Every machine must be able to resolve the FQDN of the engine that will run as a VM, here ha.lofyer.org.

# echo -e '192.168.10.100\tha.lofyer.org' >> /etc/hosts

Storage can reuse the earlier GlusterFS volume over NFSv3; make sure the bricks are owned by vdsm.kvm, that is, 36:36.

# gluster volume create gluster-vol1 replica 2 gs1.example.com:/gluster_brick0 gs2.example.com:/gluster_brick0 gs3.example.com:/gluster_brick0 gs4.example.com:/gluster_brick0 gs1.example.com:/gluster_brick1 gs2.example.com:/gluster_brick1 gs3.example.com:/gluster_brick1 gs4.example.com:/gluster_brick1 force

gluster-ovirt

Because the network services of both engine and nodes depend on network rather than NetworkManager, enable the former and disable the latter, applying configuration along these lines on every server.

NAME=eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
# adjust the IP for each host
IPADDR=192.168.10.101
NETMASK=255.255.255.0
GATEWAY=192.168.10.1
DNS1=192.168.10.1

# chkconfig NetworkManager off
# chkconfig network on
# service NetworkManager stop; service network restart

Add the repositories

# yum localinstall http://resources.ovirt.org/releases/ovirt-release.noarch.rpm
# yum install ovirt-hosted-engine-setup

Or add them manually:

[ovirt-stable]
name=Latest oVirt Releases
baseurl=http://ovirt.org/releases/stable/rpm/EL/$releasever/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[ovirt-3.4-stable]
name=Latest oVirt 3.4.z Releases
baseurl=http://ovirt.org/releases/3.4/rpm/EL/$releasever/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[epel]
name=Extra Packages for Enterprise Linux 6 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/6/$basearch
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
failovermethod=priority
enabled=1
includepkgs=epel-release,python-uinput,puppet,python-lockfile,python-cpopen,python-ordereddict,python-pthreading,python-inotify,python-argparse,novnc,python-ply,python-kitchen,python-daemon,python-websockify,livecd-tools,spice-html5,mom
gpgcheck=0

[ovirt-glusterfs-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/$basearch/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[ovirt-glusterfs-noarch-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/noarch
enabled=1
skip_if_unavailable=1
gpgcheck=0

Install hosted-engine

# yum install ovirt-hosted-engine-setup

Answer a series of questions, as shown below:

# hosted-engine --deploy
[ INFO  ] Stage: Initializing
          Continuing will configure this host for serving as hypervisor and create a VM where you have to install oVirt Engine afterwards.
          Are you sure you want to continue? (Yes, No)[Yes]: yes
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
          Configuration files: []
          Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20140508182241.log
          Version: otopi-1.2.0 (otopi-1.2.0-1.el6)
[ INFO  ] Hardware supports virtualization
[ INFO  ] Bridge ovirtmgmt already created
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization
         
          --== STORAGE CONFIGURATION ==--
         
          During customization use CTRL-D to abort.
          Please specify the storage you would like to use (nfs3, nfs4)[nfs3]: 
# This storage domain holds only the hosted engine's own files; it is not the master data domain
          Please specify the full shared storage connection path to use (example: host:/path): 192.168.10.101:/gluster-vol1/ovirt_data/hosted_engine
[ INFO  ] Installing on first host
          Please provide storage domain name. [hosted_storage]: 
          Local storage datacenter name is an internal name and currently will not be shown in engine's admin UI.Please enter local datacenter name [hosted_datacenter]: 
         
          --== SYSTEM CONFIGURATION ==--
         
         
          --== NETWORK CONFIGURATION ==--
         
          iptables was detected on your computer, do you wish setup to configure it? (Yes, No)[Yes]: no
          Please indicate a pingable gateway IP address [192.168.10.1]: 
         
          --== VM CONFIGURATION ==--
# how the engine VM will be installed
          Please specify the device to boot the VM from (cdrom, disk, pxe) [cdrom]: 
          The following CPU types are supported by this host:
          	 - model_Conroe: Intel Conroe Family
          Please specify the CPU type to be used by the VM [model_Conroe]: 
          Please specify path to installation media you would like to use [None]: /tmp/centos.iso
          Please specify the number of virtual CPUs for the VM [Defaults to minimum requirement: 2]: 
          Please specify the disk size of the VM in GB [Defaults to minimum requirement: 25]: 
          You may specify a MAC address for the VM or accept a randomly generated default [00:16:3e:59:9b:e2]: 
          Please specify the memory size of the VM in MB [Defaults to minimum requirement: 4096]: 4096
          Please specify the console type you would like to use to connect to the VM (vnc, spice) [vnc]: 
         
          --== HOSTED ENGINE CONFIGURATION ==--
         
          Enter the name which will be used to identify this host inside the Administrator Portal [hosted_engine_1]: 
          Enter 'admin@internal' user password that will be used for accessing the Administrator Portal: 
          Confirm 'admin@internal' user password: 
          Please provide the FQDN for the engine you would like to use.
          This needs to match the FQDN that you will use for the engine installation within the VM.
          Note: This will be the FQDN of the VM you are now going to create,
          it should not point to the base host or to any other existing machine.
          Engine FQDN: ha.lofyer.org
[WARNING] Failed to resolve ha.lofyer.org using DNS, it can be resolved only locally
          Please provide the name of the SMTP server through which we will send notifications [localhost]: 
          Please provide the TCP port number of the SMTP server [25]: 
          Please provide the email address from which notifications will be sent [root@localhost]: 
          Please provide a comma-separated list of email addresses which will get notifications [root@localhost]: 
[ INFO  ] Stage: Setup validation
         
          --== CONFIGURATION PREVIEW ==--
         
          Engine FQDN                        : ha.lofyer.org
          Bridge name                        : ovirtmgmt
          SSH daemon port                    : 22
          Gateway address                    : 192.168.10.1
          Host name for web application      : hosted_engine_1
          Host ID                            : 1
          Image size GB                      : 25
          Storage connection                 : 192.168.10.101:/gluster-vol1/ovirt_data/hosted_data/
          Console type                       : vnc
          Memory size MB                     : 4096
          MAC address                        : 00:16:3e:59:9b:e2
          Boot type                          : cdrom
          Number of CPUs                     : 2
          ISO image (for cdrom boot)         : /tmp/centos.iso
          CPU Type                           : model_Conroe
         
          Please confirm installation settings (Yes, No)[No]: yes
[ INFO  ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf'
[ INFO  ] Stage: Transaction setup
[ INFO  ] Stage: Misc configuration
[ INFO  ] Stage: Package installation
[ INFO  ] Stage: Misc configuration
[ INFO  ] Configuring libvirt
[ INFO  ] Configuring VDSM
[ INFO  ] Starting vdsmd
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Creating Storage Domain
[ INFO  ] Creating Storage Pool
[ INFO  ] Connecting Storage Pool
[ INFO  ] Verifying sanlock lockspace initialization
[ INFO  ] Initializing sanlock lockspace
[ INFO  ] Initializing sanlock metadata
[ INFO  ] Creating VM Image
[ INFO  ] Disconnecting Storage Pool
[ INFO  ] Start monitoring domain
[ INFO  ] Configuring VM
[ INFO  ] Updating hosted-engine configuration
[ INFO  ] Stage: Transaction commit
[ INFO  ] Stage: Closing up
          The following network ports should be opened:
              tcp:5900
              tcp:5901
              udp:5900
              udp:5901
          An example of the required configuration for iptables can be found at:
              /etc/ovirt-hosted-engine/iptables.example
          In order to configure firewalld, copy the files from
          /etc/ovirt-hosted-engine/firewalld to /etc/firewalld/services
          and execute the following commands:
              firewall-cmd -service hosted-console
[ INFO  ] Creating VM
          You can now connect to the VM with the following command:
          	/usr/bin/remote-viewer vnc://localhost:5900
          Use temporary password "2067OGHU" to connect to vnc console.
          Please note that in order to use remote-viewer you need to be able to run graphical applications.
          This means that if you are using ssh you have to supply the -Y flag (enables trusted X11 forwarding).
          Otherwise you can run the command from a terminal in your preferred desktop environment.
          If you cannot run graphical applications you can connect to the graphic console from another host or connect to the console using the following command:
          virsh -c qemu+tls://192.168.1.150/system console HostedEngine
          If you need to reboot the VM you will need to start it manually using the command:
          hosted-engine --vm-start
          You can then set a temporary password using the command:
          hosted-engine --add-console-password
          The VM has been started.  Install the OS and shut down or reboot it.  To continue please make a selection:
         
          (1) Continue setup - VM installation is complete
          (2) Reboot the VM and restart installation
          (3) Abort setup
# In another terminal capable of graphics, run "remote-viewer vnc://192.168.10.101:5900" to connect to the VM, finish the installation and shut it down (the temporary password is shown above); run "hosted-engine --add-console-password" if you need to change the console password. If you chose cdrom installation, you can instead overwrite the new VM's virtual disk on gs1 with one that already contains the engine, along the lines of "mount -t nfs 192.168.10.101:/gluster-vol1/ovirt_data/hosted_data /mnt; cp engine.raw /mnt/.../vm_UUID"
          (1, 2, 3)[1]: 1
          Waiting for VM to shut down...
[ INFO  ] Creating VM
          You can now connect to the VM with the following command:
          	/usr/bin/remote-viewer vnc://localhost:5900
          Use temporary password "2067OGHU" to connect to vnc console.
          Please note that in order to use remote-viewer you need to be able to run graphical applications.
          This means that if you are using ssh you have to supply the -Y flag (enables trusted X11 forwarding).
          Otherwise you can run the command from a terminal in your preferred desktop environment.
          If you cannot run graphical applications you can connect to the graphic console from another host or connect to the console using the following command:
          virsh -c qemu+tls://192.168.1.150/system console HostedEngine
          If you need to reboot the VM you will need to start it manually using the command:
          hosted-engine --vm-start
          You can then set a temporary password using the command:
          hosted-engine --add-console-password
          Please install and setup the engine in the VM.
          You may also be interested in installing ovirt-guest-agent-common package in the VM.
          To continue make a selection from the options below:
          (1) Continue setup - engine installation is complete
          (2) Power off and restart the VM
          (3) Abort setup
# As in the first step, connect to the VM console and run "engine-setup --offline" to install the engine
          (1, 2, 3)[1]: 1
[ INFO  ] Engine replied: DB Up!Welcome to Health Status!
[ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO  ] Still waiting for VDSM host to become operational...
[ INFO  ] The VDSM Host is now operational
          Please shutdown the VM allowing the system to launch it as a monitored service.
# At this point, connect to the VM console and shut the VM down
          The system will wait until the VM is down.
[ INFO  ] Enabling and starting HA services
          Hosted Engine successfully set up
[ INFO  ] Stage: Clean up
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination

Now run "hosted-engine --vm-start" to start the engine.

10. Introduction to oVirt

Welcome to the core!

Cloud computing today is delivered mainly as SaaS (Software-as-a-Service), PaaS (Platform-as-a-Service) and IaaS (Infrastructure-as-a-Service). IaaS and PaaS both count as infrastructure components; SaaS can be combined with them freely or used on its own.

Virtualization has attracted great attention and been pushed to the crest of the wave. With so many open-source virtualization platforms around (XenServer, CloudStack, OpenStack, Eucalyptus, oVirt, OpenVZ and more), it is easy to be dazzled and unsure which one fits.

Implementation approach of each platform: full virtualization, paravirtualization, TBD

When writing this document I knew oVirt only a little, had barely scratched the surface of others (XenServer, OpenStack), and had hardly used the rest, so let me stick to what I know best.

Taken together, an IaaS cloud platform has the following characteristics:

Virtualization

Virtualization is the core of a cloud platform and one of the main forms resource consumption takes. Network, storage, CPU and even GPU resources are realized chiefly through virtual machines.

Distribution

Distribution integrates resources over the network on top of shared storage; it is a prerequisite for turning hardware into pooled resources.

High availability

For a cloud platform of any scale, high availability of the management node and of critical services is essential. At the time of writing, oVirt 3.4 can already make the management node highly available.

Compatibility

Cloud platforms abound, each with its own strengths, and mixing platforms within one data center is very likely. Core services (virtual networks, storage, VMs and so on) therefore need a degree of interoperability: for example, oVirt can consume virtual networks provided by OpenStack Neutron, and Foreman can conveniently deploy new machines onto oVirt.

In addition, APIs such as DeltaCloud and libvirt let users build their own integrated cloud management tools.

Resource pooling

Network, storage, CPU and GPU resources can be pooled together or separately and allocated through quotas, ensuring sensible utilization.

Security

Modern enterprises have very strict security requirements. Beyond traditional data encryption and access control, a platform should offer some protection even against social engineering; user data must be shielded from everyone but the enterprise administrators, so that copying a virtual machine's disk file out of the platform does not directly reveal its contents.

Demand orientation

At the compute level, the best resources go to the most important services first; at the service level, the platform is customizable.

oVirt physical-layer views

Management on a dedicated physical machine

Normal

Highly available management machine

HA

Coming up

Section 11 (6-2): setting up the management engine;
Section 12 (6-3): setting up a highly available management engine;
Section 13 (6-4): adding nodes to build a complete cloud platform;
Section 14 (6-5): advanced usage.

9. GlusterFS Usage Examples and Tips

File ownership

When a volume is created, glusterfs sets the brick owner to root.root. For some applications, remember to change the owner of the brick directory (for example with a chown entry in /etc/rc.local), but never touch the hidden .glusterfs directory inside the brick.
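A sketch for auditing that rule (the function and its name are mine, not a gluster tool): walk a brick and report entries not owned by the expected uid/gid, leaving the internal .glusterfs directory alone:

```python
import os

def wrong_owner(brick_path, uid=36, gid=36):
    """List paths under brick_path not owned by uid:gid (vdsm.kvm is 36:36),
    skipping gluster's internal .glusterfs metadata directory."""
    bad = []
    for root, dirs, files in os.walk(brick_path):
        if ".glusterfs" in dirs:
            dirs.remove(".glusterfs")  # never touch gluster's own metadata
        for name in dirs + files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            if st.st_uid != uid or st.st_gid != gid:
                bad.append(path)
    return bad
```

You can then chown only the reported entries, e.g. from /etc/rc.local as suggested above.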

Splitting bricks

The brick layout described earlier is not ideal in every scenario. You can instead create a subdirectory under /brickX, say data1, and pass HOST:/brickX/data1 as the brick when creating the volume, to use the space more flexibly.

Combining plain, replica and striped volumes

Bricks are grouped in a fixed order: each run of `replica` consecutive bricks forms a replica set, and replica sets are then combined into striped (or plain distributed) groups. When computing capacity, the bricks of a replica set merge into one (several count as one), and one striped group counts as a single effective block.
Suppose we have 4 hosts with 8 bricks of 5GB each, as in the figure below.

brick-base

When the volume is created with this command:

# gluster volume create gluster-vol1 stripe 2 replica 2 host1:/brick1 host1:/brick2 host2:/brick1 host2:/brick2 host3:/brick1 host3:/brick2 host4:/brick1 host4:/brick2 force

the bricks are combined as follows:

brick-1

Whereas creating it with this command:

# gluster volume create gluster-vol1 stripe 2 replica 2 host1:/brick1 host2:/brick1 host3:/brick1 host4:/brick1 host1:/brick2 host2:/brick2 host3:/brick2 host4:/brick2 force

they are combined like this (note the color pairing in the figure):

brick-2
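The two orderings can be checked mechanically. The sketch below assumes the grouping rule described above (each run of `replica` consecutive bricks forms a replica set, consecutive sets form a stripe group); it illustrates the ordering logic only, it is not the gluster implementation:

```python
def group_bricks(bricks, stripe, replica):
    """Group a brick list the way 'stripe S replica R' consumes it:
    consecutive bricks form replica sets, consecutive sets form stripe groups."""
    rsets = [bricks[i:i + replica] for i in range(0, len(bricks), replica)]
    return [rsets[i:i + stripe] for i in range(0, len(rsets), stripe)]

hosts = ["host1", "host2", "host3", "host4"]
# first command above: both bricks of a host are adjacent
adjacent = ["%s:/brick%d" % (h, b) for h in hosts for b in (1, 2)]
# second command above: bricks interleaved across hosts
interleaved = ["%s:/brick%d" % (h, b) for b in (1, 2) for h in hosts]

for name, order in (("adjacent", adjacent), ("interleaved", interleaved)):
    print(name, group_bricks(order, 2, 2))

# With the adjacent ordering, the first replica set is
# ['host1:/brick1', 'host1:/brick2']: both copies sit on host1, so losing
# that host loses the data.  With the interleaved ordering every replica
# set spans two hosts.  Capacity either way: 8 bricks x 5GB / replica 2
# gives 20GB usable.
```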

Mounting over NFS

Because glusterfs occupies port 2049, it generally cannot coexist with an NFS server on the same machine unless the NFS service port is changed.

# mount -t nfs -o vers=3 server1:/volume1 /mnt
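A quick way to check whether something already holds a port on a server is a plain bind test (generic sketch, not gluster-specific):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP bind on host:port fails, i.e. something holds it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
    except OSError:
        return True
    finally:
        s.close()
    return False

if __name__ == "__main__":
    print("port 2049 in use:", port_in_use(2049))
```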

Mounting over CIFS

First mount the volume on a server or client, then export it over CIFS, for example with a Samba share:

 [glustertest]
comment = For testing a Gluster volume exported through CIFS
path = /mnt/glusterfs
read only = no
guest ok = yes

Advanced notes

1. Repairing split-brain

After a split-brain, the nodes' copies of a file may disagree. Locate and repair the affected files as follows.

Locate the split-brain files
using the command

# gluster volume heal <VOLNAME> info split-brain

or by looking for files that still return Input/Output errors on clients.
Close any open files or running virtual machines
Decide which replica is the correct one
Restore the extended attributes

2. Reusing a brick

If you remove a brick from a volume in use and later try to reuse it, you will see "/bricks/app or a prefix of it is already part of a volume"; glusterfs 3.3 and later behave this way.
Solution:

# setfattr -x trusted.glusterfs.volume-id $brick_path
# setfattr -x trusted.gfid $brick_path
# rm -rf $brick_path/.glusterfs

3. Scaling out

For medium-sized or larger deployments, use a DNS server to resolve the node names and spare yourself the hassle of editing hosts files.

8. Building GlusterFS as the Storage Foundation

Since the goal is a solid foundation, glusterfs is used here as a distributed striped replicated volume, on four machines preinstalled with CentOS 6 (with SELINUX set to permissive).

Distributed_Striped_Replicated_Volume

Add DNS records or edit the hosts file

Since the author's environment has no dedicated DNS for now, we edit the hosts file instead (do this on every machine):

# echo -e "192.168.10.101\tgs1.example.com\n192.168.10.102\tgs2.example.com\n192.168.10.103\tgs3.example.com\n192.168.10.104\tgs4.example.com" >> /etc/hosts
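The echo one-liner is easy to get wrong; the same lines can be generated from a mapping (the hostnames below are the ones used in this section):

```python
nodes = {
    "192.168.10.101": "gs1.example.com",
    "192.168.10.102": "gs2.example.com",
    "192.168.10.103": "gs3.example.com",
    "192.168.10.104": "gs4.example.com",
}

def hosts_lines(mapping):
    """Render /etc/hosts lines, one 'IP<TAB>FQDN' pair per line."""
    return "\n".join("%s\t%s" % (ip, fqdn) for ip, fqdn in sorted(mapping.items()))

print(hosts_lines(nodes))
```

Append the output to /etc/hosts on every machine.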

Add the repositories

[epel]
name=Extra Packages for Enterprise Linux 6 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/6/$basearch
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6

[glusterfs-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/$basearch/
enabled=1
skip_if_unavailable=1
gpgcheck=1
gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

[glusterfs-noarch-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/noarch
enabled=1
skip_if_unavailable=1
gpgcheck=1
gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

[glusterfs-source-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes. - Source
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/SRPMS
enabled=0
skip_if_unavailable=1
gpgcheck=1
gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

准备磁盘

每一个节点都可以看做gluster server,安装xfs用户空间工具:

# yum install -y glusterfs glusterfs-fuse glusterfs-server xfsprogs
# /etc/init.d/glusterd start
# /etc/init.d/glusterfsd start
# chkconfig glusterfsd on
# chkconfig glusterd on

假设每台机器除系统盘之外都有2块1TB SATA硬盘,下面对其进行分区、格式化并挂载(可直接使用分区,也可选择LVM逻辑卷):

# fdisk /dev/sdX << EOF
n
p
1


w
EOF

直接使用物理盘上的分区:

# mkfs.xfs -i size=512 /dev/sdb1
# mkfs.xfs -i size=512 /dev/sdc1
# mkdir /gluster_brick0
# mkdir /gluster_brick1
# echo -e "/dev/sdb1\t/gluster_brick0\txfs\tdefaults\t0 0\n/dev/sdc1\t/gluster_brick1\txfs\tdefaults\t0 0" >> /etc/fstab
# mount -a

或者使用逻辑卷:

# pvcreate /dev/sdb1 /dev/sdc1
# vgcreate vg_gluster /dev/sdb1 /dev/sdc1
# lvcreate --name lv_gluster -l 100%FREE vg_gluster
# mkfs.xfs -i size=512 /dev/mapper/vg_gluster-lv_gluster
# mkdir /gluster_brick
# echo -e "/dev/mapper/vg_gluster-lv_gluster\t/gluster_brick\txfs\tdefaults\t0 0" >> /etc/fstab
# mount -a

为什么要用XFS?
XFS具有元数据日志功能,可以在崩溃后快速恢复数据;同时支持在线扩容与在线碎片整理。其他文件系统(如EXT3、EXT4)在此场景下未经充分测试。
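以“在线扩容”为例,当采用前述LVM方式时,可在不卸载brick的情况下扩展空间。示意操作如下(假设卷组中尚有空闲空间,逻辑卷与挂载点沿用前文命名):

```
# 先扩展逻辑卷,再在挂载状态下在线扩展XFS文件系统
# lvextend -L +100G /dev/vg_gluster/lv_gluster
# xfs_growfs /gluster_brick
```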

配置节点,添加volume

在其中任何一台机器上,比如gs2.example.com,将其余三个节点加入可信存储池:

# gluster
  > peer probe gs1.example.com
  > peer probe gs3.example.com
  > peer probe gs4.example.com

添加brick至volume。注意合理调整砖块顺序:replica为2时,命令行中相邻的两个brick互为副本,下例中同一主机的两个brick恰好互为副本(gluster会就此告警,故需加force参数);生产环境中应将不同主机的brick交错排列,以免副本落在同一台机器上。

# gluster
  > volume create gluster-vol1 stripe 2 replica 2 gs1.example.com:/gluster_brick0 gs1.example.com:/gluster_brick1 gs2.example.com:/gluster_brick0 gs2.example.com:/gluster_brick1 gs3.example.com:/gluster_brick0 gs3.example.com:/gluster_brick1 gs4.example.com:/gluster_brick0 gs4.example.com:/gluster_brick1 force
  > volume start gluster-vol1
  > volume status
Status of volume: gluster-vol1
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick gs1.example.com:/gluster_brick0                   49152   Y       1984
Brick gs1.example.com:/gluster_brick1                   49153   Y       1995
Brick gs2.example.com:/gluster_brick0                   49152   Y       1972
Brick gs2.example.com:/gluster_brick1                   49153   Y       1983
Brick gs3.example.com:/gluster_brick0                   49152   Y       1961
Brick gs3.example.com:/gluster_brick1                   49153   Y       1972
Brick gs4.example.com:/gluster_brick0                   49152   Y       1975
Brick gs4.example.com:/gluster_brick1                   49153   Y       1986
NFS Server on localhost                                 2049    Y       1999
Self-heal Daemon on localhost                           N/A     Y       2006
NFS Server on gs1.example.com                           2049    Y       2007
Self-heal Daemon on gs1.example.com                     N/A     Y       2014
NFS Server on gs4.example.com                           2049    Y       1995
Self-heal Daemon on gs4.example.com                     N/A     Y       2002
NFS Server on gs3.example.com                           2049    Y       1986
Self-heal Daemon on gs3.example.com                     N/A     Y       1993
 
Task Status of Volume gluster-vol1
------------------------------------------------------------------------------
There are no active volume tasks
  > volume info all
 
Volume Name: gluster-vol1
Type: Distributed-Striped-Replicate
Volume ID: bc8e102c-2b35-4748-ab71-7cf96ce083f3
Status: Started
Number of Bricks: 2 x 2 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gs1.example.com:/gluster_brick0
Brick2: gs1.example.com:/gluster_brick1
Brick3: gs2.example.com:/gluster_brick0
Brick4: gs2.example.com:/gluster_brick1
Brick5: gs3.example.com:/gluster_brick0
Brick6: gs3.example.com:/gluster_brick1
Brick7: gs4.example.com:/gluster_brick0
Brick8: gs4.example.com:/gluster_brick1
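结合上文“合理调整砖块顺序”的提示,可以用一段小脚本示意replica=2时副本对的组成方式(砖块名为正文顺序的缩写,仅作演示):

```shell
# replica=2 时,volume create 命令中相邻的两个brick互为副本
# 按正文示例的砖块顺序枚举副本对(主机名与路径为缩写)
bricks="gs1:/b0 gs1:/b1 gs2:/b0 gs2:/b1 gs3:/b0 gs3:/b1 gs4:/b0 gs4:/b1"
pairs=""
set -- $bricks
while [ "$#" -gt 0 ]; do
    echo "replica pair: $1 <-> $2"
    pairs="$pairs$1<->$2 "
    shift 2
done
```

可以看到,按正文顺序,同一主机的两个brick互为副本;若希望副本跨主机,应交错排列,如 gs1:/b0 gs2:/b0 gs1:/b1 gs2:/b1 等。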

客户端挂载glusterfs

当用glusterfs-fuse挂载时,客户端的hosts文件里需要有gluster server中的任一节点做解析:

127.0.0.1       localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6

192.168.10.101  gs1.example.com

安装glusterfs-fuse,将gluster卷以glusterfs方式挂载,并写入1MB文件查看其在各砖块上的分配:

# yum install glusterfs glusterfs-fuse
# mount.glusterfs 192.168.10.101:/gluster-vol1 /mnt
# cd /mnt
# dd if=/dev/zero of=a.img bs=1k count=1k
# cp a.img b.img; cp a.img c.img; cp a.img d.img

在四台服务端分别查看:

[root@gs1 ~]# ls -lh /gluster_brick*
/gluster_brick0:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 d.img
/gluster_brick1:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 d.img
[root@gs2 ~]# ls -lh /gluster_brick*
/gluster_brick0:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 d.img
/gluster_brick1:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 d.img
[root@gs3 ~]# ls -lh /gluster_brick*
/gluster_brick0:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 c.img
/gluster_brick1:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 c.img
[root@gs4 ~]# ls -lh /gluster_brick*
/gluster_brick0:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 c.img
/gluster_brick1:
total 1.0M
-rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
-rw-r--r--. 2 root root 512K Apr 22 17:13 c.img
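上述结果可以这样解读:stripe 2将1MB文件切成两个512KB条带分片,replica 2又为每个分片保留一份副本,因此共有4个砖块各存有512KB。可用简单算式验证(数值与上文dd示例一致):

```shell
# 1MB 文件在 stripe=2、replica=2 卷上的分布验证
file_kb=1024                          # dd bs=1k count=1k 写入的大小
stripe=2
replica=2
chunk_kb=$((file_kb / stripe))        # 每个条带分片大小(KB)
brick_copies=$((stripe * replica))    # 存有该文件分片的砖块数
raw_kb=$((chunk_kb * brick_copies))   # 集群实际占用的原始空间(KB)
echo "chunk=${chunk_kb}K copies=${brick_copies} raw=${raw_kb}K"
```

换言之,该卷的可用容量约为原始容量的一半,这是replica 2必然付出的空间代价。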

至此全部配置结束,后文将介绍glusterfs的使用方法与部分技巧。