Swift Source Code Analysis: swift-account-audit (1)

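Last edited by pig2 on 2014-11-21 15:15
Guiding questions

1. How is an instance of the Auditor class created?
2. How is a specified object audited?
3. How are the nodes and the partition number for all replicas of a named object obtained?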

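Overview:
This script audits an account, container, or object named on the command line.
What it does depends on the argument:
an object path audits that object;
a container path audits the container and, recursively, every object in it;
an account path audits the account and, recursively, every container in it and every object in those containers.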

Examples:
    /usr/bin/swift-account-audit SOSO_88ad0b83-b2c5-4fa1-b2d6-60c597202076
    /usr/bin/swift-account-audit SOSO_88ad0b83-b2c5-4fa1-b2d6-60c597202076/container/object
    /usr/bin/swift-account-audit -e errors.txt SOSO_88ad0b83-b2c5-4fa1-b2d6-60c597202076/container
    /usr/bin/swift-account-audit 
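
The -e flag in the third example is one of four options that the script's main block (walked through later in this post) reads with getopt. As a minimal sketch, assuming only the option string 'c:r:e:d' and the defaults visible in that block, the parsing can be isolated into a helper. parse_audit_args is a hypothetical name used for illustration and is not part of Swift:

```python
import getopt


def parse_audit_args(argv):
    """Parse swift-account-audit style flags into (options, paths).

    Hypothetical helper: it mirrors the option handling of the main
    block, but is not a function that exists in Swift itself.
    """
    # 'c:', 'r:' and 'e:' take a value; 'd' is a bare switch.
    optlist, args = getopt.getopt(argv, 'c:r:e:d')
    opts = dict(optlist)
    options = {
        'concurrency': int(opts.get('-c', 50)),     # size of the worker pool
        'error_file': opts.get('-e'),               # file collecting inconsistent paths
        'swift_dir': opts.get('-r', '/etc/swift'),  # directory passed as swift_dir
        'deep': '-d' in opts,                       # download and hash object bodies
    }
    return options, args


if __name__ == '__main__':
    print(parse_audit_args(['-e', 'errors.txt', 'AUTH_demo/container']))
```

Note that getopt stops at the first non-option argument, so the flags have to come before the paths, as they do in the examples above.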

Source code walkthrough:

if __name__ == '__main__':
    try:
        optlist, args = getopt.getopt(sys.argv[1:], 'c:r:e:d')
    except getopt.GetoptError as err:
        print str(err)
        print usage
        sys.exit(2)
    if not args and os.isatty(sys.stdin.fileno()):
        print usage
        sys.exit()
    opts = dict(optlist)
    options = {
        'concurrency': int(opts.get('-c', 50)),
        'error_file': opts.get('-e', None),
        'swift_dir': opts.get('-r', '/etc/swift'),
        'deep': '-d' in opts,
    }
    auditor = Auditor(**options)
    if not os.isatty(sys.stdin.fileno()):
        args = chain(args, sys.stdin)
    
    # This loop means several targets can be audited in a single invocation.
    for path in args:
        path = '/' + path.rstrip('\r\n').lstrip('/')
        # What happens depends on the path:
        #   an object path audits that object;
        #   a container path audits the container and, recursively, every object in it;
        #   an account path audits the account and, recursively, every container and object in it.
        auditor.audit(*split_path(path, 1, 3, True))
    auditor.wait()
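    auditor.print_stats()

1. Parse the command-line options;
2. Create an instance of the Auditor class;
3. Call auditor.audit(*split_path(path, 1, 3, True)), which invokes a different method depending on whether the command-line argument names an account, a container, or an object;
4. Print the audit results.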
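
Step 3 depends on how a path is cut into account, container and object. The sketch below is a simplified stand-in written for illustration, not Swift's split_path. It assumes that split_path(path, 1, 3, True) means "at least an account, at most three fields, and any further slashes stay in the object name". split_audit_path and normalise are hypothetical names:

```python
def normalise(arg):
    """Mirror the main loop: drop line endings and force one leading slash."""
    return '/' + arg.rstrip('\r\n').lstrip('/')


def split_audit_path(path):
    """Split '/account[/container[/object]]' into exactly three fields.

    Hypothetical helper, not Swift's split_path. Missing fields come
    back as None; the object name keeps any slashes it contains.
    """
    if not path.startswith('/'):
        raise ValueError('path must start with a slash: %r' % path)
    # Split at most twice, so everything after the container is the object.
    segs = path[1:].split('/', 2)
    if not segs[0]:
        raise ValueError('an account is required: %r' % path)
    # Treat empty segments (for example after a trailing slash) as missing.
    segs = [seg or None for seg in segs]
    return segs + [None] * (3 - len(segs))


if __name__ == '__main__':
    for arg in ['AUTH_demo\n', 'AUTH_demo/photos', '/AUTH_demo/photos/2014/cat.jpg']:
        print(split_audit_path(normalise(arg)))
```

Because the result always has three fields, it can be unpacked straight into audit(account, container, obj), and the None values select the account or container case.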

Moving on to step 3, here is the audit method:

def audit(self, account, container=None, obj=None):
        """
        Dispatch on which arguments were given:
        object: audit that object;
        container: audit the container and, recursively, every object in it;
        account: audit the account and, recursively, every container in it and every object in those containers.
        """
        # Audit a single object.
        if obj and container:
            self.pool.spawn_n(self.audit_object, account, container, obj)
        # Audit a container and, recursively, every object in it.
        elif container:
            self.pool.spawn_n(self.audit_container, account, container, True)
        # Audit an account and, recursively, every container and object in it.
        else:
            self.pool.spawn_n(self.audit_account, account, True)

3.1 audit_object audits the specified object;
3.2 audit_container audits the specified container and, recursively, every object in it;
3.3 audit_account audits the specified account and, recursively, every container in it and every object in those containers.
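
The three-way dispatch can be tried on its own. This is a minimal sketch, assuming a stand-in pool that records each job instead of running it concurrently. RecordingPool and DispatchDemo are hypothetical names, and calling the trailing True argument "recurse" is an assumption, since the signatures of audit_container and audit_account do not appear in this post:

```python
class RecordingPool(object):
    """Stand-in for the auditor's pool: records jobs instead of running them."""

    def __init__(self):
        self.jobs = []

    def spawn_n(self, func, *args):
        # The real pool would schedule func(*args); here we only note it down.
        self.jobs.append((func.__name__, args))


class DispatchDemo(object):
    """Hypothetical class that reproduces only the dispatch logic of audit."""

    def __init__(self):
        self.pool = RecordingPool()

    # Empty placeholders; "recurse" is an assumed parameter name.
    def audit_object(self, account, container, obj):
        pass

    def audit_container(self, account, container, recurse=False):
        pass

    def audit_account(self, account, recurse=False):
        pass

    def audit(self, account, container=None, obj=None):
        if obj and container:
            self.pool.spawn_n(self.audit_object, account, container, obj)
        elif container:
            self.pool.spawn_n(self.audit_container, account, container, True)
        else:
            self.pool.spawn_n(self.audit_account, account, True)


if __name__ == '__main__':
    demo = DispatchDemo()
    demo.audit('AUTH_demo')
    demo.audit('AUTH_demo', 'photos')
    demo.audit('AUTH_demo', 'photos', 'cat.jpg')
    for job in demo.pool.jobs:
        print(job)
```

An object is audited directly only when both a container and an object name are present; every other combination falls through to the container or account case.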

Moving on to 3.1, here is the implementation of audit_object:

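def audit_object(self, account, container, name):
      """
      Audit the specified object.
      """
      # Build the full path of the object under the given account and container.
      path = '/%s/%s/%s' % (account, container, name)

      # Look up the partition number and the nodes for every replica of the named object.
      # A partition has several replicas, which may sit on different nodes, so there can be more than one node.
      # get_nodes returns a tuple (partition, list of node dicts);
      # each node dict contains at least id, weight, zone, ip, port, device and meta.
      part, nodes = self.object_ring.get_nodes(account, container.encode('utf-8'), name.encode('utf-8'))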
        
      # Fetch the object listing for the given account and container.
      container_listing = self.audit_container(account, container)
      consistent = True
      if name not in container_listing:
          print "  Object %s missing in container listing!" % path
          consistent = False
          hash = None
      else:
          hash = container_listing[name]['hash']
        
      etags = []
        
      # Query every node that holds a replica of this partition.
      for node in nodes:
          try:
              if self.deep:
                  # Open a connection to the object server on this node.
                  conn = http_connect(node['ip'], node['port'], node['device'], part, 'GET', path, {})
                  resp = conn.getresponse()
                  calc_hash = md5()
                  chunk = True
                  while chunk:
                      chunk = resp.read(8192)
                      calc_hash.update(chunk)
                  calc_hash = calc_hash.hexdigest()
                  if resp.status // 100 != 2:
                      self.object_not_found += 1
                      consistent = False
                      print '  Bad status GETting object "%s" on %s/%s' % (path, node['ip'], node['device'])
                      continue
                  if resp.getheader('ETag').strip('"') != calc_hash:
                      self.object_checksum_mismatch += 1
                      consistent = False
                      print '  MD5 does not match etag for "%s" on %s/%s' % (path, node['ip'], node['device'])
                  etags.append(resp.getheader('ETag'))
              else:
                  conn = http_connect(node['ip'], node['port'],
                                      node['device'], part, 'HEAD',
                                      path.encode('utf-8'), {})
                  resp = conn.getresponse()
                  if resp.status // 100 != 2:
                      self.object_not_found += 1
                      consistent = False
                      print '  Bad status HEADing object "%s" on %s/%s' % (path, node['ip'], node['device'])
                      continue
                  etags.append(resp.getheader('ETag'))
          except Exception:
              self.object_exceptions += 1
              consistent = False
              print '  Exception fetching object "%s" on %s/%s' % (path, node['ip'], node['device'])
              continue
      if not etags:
          consistent = False
          print "  Failed fo fetch object %s at all!" % path
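      elif hash:
          # Note: as written, this loop never uses etag. resp and node keep
          # whatever values the node loop above left in them, so the same
          # response is compared on every iteration.
          for etag in etags:
              if resp.getheader('ETag').strip('"') != hash:
                  consistent = False
                  self.object_checksum_mismatch += 1
                  print '  ETag mismatch for "%s" on %s/%s' % (path, node['ip'], node['device'])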
      if not consistent and self.error_file:
          print >>open(self.error_file, 'a'), path
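      self.objects_checked += 1

3.1.1 Build the full path of the object under the given account and container;
3.1.2 Look up the partition number and the nodes for every replica of the named object;
3.1.3 Call audit_container to fetch the object listing for the account and container and check that the object appears in it; if it does, take the object's hash from the listing;
3.1.4 Loop over the nodes holding the object's replicas and, for each node:
(1) if deep is True, run a deep check: issue an HTTP GET to the node, use the response status to decide whether the replica exists there, then compare the replica's ETag with the MD5 hash computed over the downloaded body to decide whether the replica is valid;
(2) if deep is False, skip the deep check: issue an HTTP HEAD to the node and use the response status to decide whether the replica exists there;
3.1.5 Compare the hash recorded in the container listing with the ETag of each remote replica to decide whether the replicas are valid.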
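
Steps 3.1.4 and 3.1.5 can be rehearsed without a cluster. The following is a minimal sketch, assuming each replica's HTTP response has already been reduced to a plain dict with the keys node, status, etag and body. That dict layout and the names md5_of_stream and check_replicas are hypothetical and do not come from Swift:

```python
import io
from hashlib import md5


def md5_of_stream(stream, chunk_size=8192):
    """Hash a file-like object chunk by chunk, as the deep GET branch does."""
    digest = md5()
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        digest.update(chunk)
    return digest.hexdigest()


def check_replicas(replicas, listing_hash=None, deep=False):
    """Return a list of problems found among one object's replicas.

    Hypothetical helper: every replica is a dict with 'node', 'status',
    'etag' and, for deep checks, 'body' (bytes).
    """
    problems = []
    etags = []
    for rep in replicas:
        # Any non-2xx status counts as "replica not found on this node".
        if rep['status'] // 100 != 2:
            problems.append('bad status on %s' % rep['node'])
            continue
        etag = rep['etag'].strip('"')
        # Deep check: the body's MD5 must equal the ETag the server reported.
        if deep and md5_of_stream(io.BytesIO(rep['body'])) != etag:
            problems.append('md5 does not match etag on %s' % rep['node'])
        etags.append((rep['node'], etag))
    if not etags:
        problems.append('no replica could be fetched')
    elif listing_hash:
        # Final step: every replica's ETag must match the container listing.
        for node, etag in etags:
            if etag != listing_hash:
                problems.append('etag mismatch on %s' % node)
    return problems


if __name__ == '__main__':
    body = b'hello'
    good = md5(body).hexdigest()
    replicas = [
        {'node': 'n1', 'status': 200, 'etag': '"%s"' % good, 'body': body},
        {'node': 'n2', 'status': 404, 'etag': '', 'body': b''},
        {'node': 'n3', 'status': 200, 'etag': good, 'body': b'HELLO'},
    ]
    print(check_replicas(replicas, listing_hash=good, deep=True))
```

Keeping each ETag paired with the node it came from lets the final comparison name the replica that disagrees with the container listing.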
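
I had meant to put everything in a single post, but after several attempts the formatting proved too hard to maintain at that length, so the material is split across several posts.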

Next post:
Swift Source Code Analysis: swift-account-audit (2)
