Saturday, January 29, 2011, 09:14
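With the long Spring Festival holiday approaching, some system administrators have been asked by their bosses to write up a maintenance checklist for the company's software and hardware. For operations staff who have never written this kind of document, that can be a real headache.
How should a system maintenance checklist be written?
In fact, this is not only a matter for the days around a long holiday. System administrators should make it a habit to carry out software and hardware maintenance against a checklist on a regular schedule (for example daily, weekly, and monthly).
In short, system maintenance mainly covers the following areas:
From a certain point of view, a system maintenance checklist should be an iron rule that every system administrator must follow.
As for concrete checklists, quite a few vendors (Microsoft and IBM in particular) publish reference documents for software and hardware maintenance checklists. Unfortunately, most of them have not been translated into Chinese (which is also why good English matters so much for technical people: too many manuals are English only). Excerpts from some of these documents follow for reference.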
Microsoft BizTalk Server maintenance checklist reference
Steps | Reference |
---|---|
Check for failed disks in the hardware RAID (reliability check). | "View Disk Properties" in the Windows Server 2003 product Help at http://go.microsoft.com/fwlink/?linkid=104161 |
Check for messages requiring manual intervention such as suspended messages (reliability check). | For information about manually checking for suspended messages, see "Investigating Orchestration, Port, and Message Failures" in BizTalk Server 2006 R2 Help at http://go.microsoft.com/fwlink/?linkid=104169. For information about performing automated monitoring using Microsoft Operations Manager 2005, see "Suspended Message Alerts" at http://go.microsoft.com/fwlink/?linkid=105059 |
Check the event logs for errors and warnings (administration check). | BizTalk Server 2006 R2 error and warning events are saved in the application log. The event source is "BizTalk Server 2006". We recommend that you monitor the event log using an automated solution such as Microsoft System Center Operations Manager. For more information, see "Monitoring with MOM 2005 or Operations Manager 2007". |
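The event log check above lends itself to a small script. Here is a minimal sketch in Python, assuming the application log has already been exported to a list of records. The field names `source` and `level` and the helper `biztalk_problems` are hypothetical; only the event source name "BizTalk Server 2006" comes from the checklist.

```python
from typing import Dict, Iterable, List

BIZTALK_SOURCE = "BizTalk Server 2006"   # event source named in the checklist
PROBLEM_LEVELS = {"Error", "Warning"}    # assumed level names in the export


def biztalk_problems(events: Iterable[Dict]) -> List[Dict]:
    """Return the error and warning records raised by BizTalk Server."""
    return [
        event for event in events
        if event.get("source") == BIZTALK_SOURCE
        and event.get("level") in PROBLEM_LEVELS
    ]
```

An automated monitoring product, as the checklist recommends, remains the better choice in production; a filter like this is only useful for a quick manual review.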
Steps | Reference |
---|---|
Ensure that each host has an instance running on at least two physical BizTalk servers (reliability check). | |
Ensure that each receive location is redundant (reliability check). | |
Ensure that the SQL Server Agent service is running on the SQL server (administration check). | |
Ensure that all SQL Server jobs related to BizTalk Server are working properly (administration check). | |
Ensure that the SQL Server jobs responsible for backing up BizTalk Server databases are running normally (administration check). | |
Ensure that the latest security updates are installed (security check). | Microsoft Update site at http://update.microsoft.com/microsoftupdate/v6/default.aspx |
Analyze weekly performance monitoring logs against baseline and thresholds (performance check). | |
Ensure that the system is not experiencing frequent auto-growth of BizTalk Server databases (performance check). | |
Run SQL Server Profiler during high load to check for long response times and high resource usage (performance check). | "Using SQL Server Profiler" in the SQL Server 2005 Books Online at http://go.microsoft.com/fwlink/?LinkID=106720 |
Ensure that message batching for all adapters is appropriate for resource consumption or latency (performance check). | |
Ensure that the large message threshold is appropriate for resource consumption (performance check). | |
Steps | Reference |
---|---|
Ensure the master secret key is backed up and readily available on offline storage (reliability check). | |
Ensure that failover for all clustered services has been tested (reliability check). | |
Ensure that the Enterprise SSO service is clustered (reliability check). | |
Ensure that the BizTalk Server databases are clustered under SQL Server services (reliability check). | |
Ensure that at least two physical BizTalk servers are part of the BizTalk group (reliability check). | |
Determine whether any unstable code is being used, and if so, use separate hosts (reliability check). | |
Perform functional testing of all new BizTalk applications (reliability check). | |
Determine whether there are any unnecessary BizTalk applications, artifacts, and configurations (administration check). | |
Check the BizTalk Server Administration console for any non-approved changes (administration check). | "Using the BizTalk Server Administration Console" in BizTalk Server 2006 R2 Help at http://go.microsoft.com/fwlink/?LinkId=106723 |
Check BTSNTSvc.exe.config for any non-approved modifications (administration check). | "BTSNTSvc.exe.config File" in BizTalk Server 2006 R2 Help at http://go.microsoft.com/fwlink/?LinkId=106724 |
Check the BizTalk Server-related registry keys for any non-approved modifications (administration check). | "Windows registry information for advanced users" article at http://support.microsoft.com/kb/256986 |
Run the Best Practices Analyzer for BizTalk Server (administration check). | "BizTalk Server 2006 Best Practices Analyzer" article at http://go.microsoft.com/fwlink/?LinkId=83317 |
Ensure that the latest service packs and updates are installed (administration and security check). | Microsoft Update site at http://update.microsoft.com/microsoftupdate/v6/default.aspx |
Ensure that the artifacts for different trading partners are not installed on the same host (security check). | |
Ensure that BizTalk Server is using only domain-level users and groups (security check). | "Domain Groups" in BizTalk Server 2006 R2 Help at http://go.microsoft.com/fwlink/?LinkId=106725 |
Ensure that the MSDTC Security Configuration is enabled (security check). | "Set the appropriate MSDTC Security Configuration options on Windows Server 2003 SP1 and Windows XP SP2" entry in "Troubleshooting Problems with MSDTC" in BizTalk Server 2006 R2 Help at http://go.microsoft.com/fwlink/?LinkID=101609 |
Determine whether the BizTalk Server cache refresh interval needs to be increased (performance check). | |
Determine whether the throttling options of each host need to be adjusted (performance check). | |
Determine whether unnecessary tracking is enabled, such as orchestration, shape, and Business Rule Engine (BRE) event tracking (performance check). | |
Determine whether you are using a dedicated host for tracking maintenance (performance check). | |
Determine whether the default XML send pipeline is being used instead of the PassThrough send pipeline (performance check). | "Managing Send Ports Using BizTalk Explorer" in BizTalk Server 2006 R2 Help at http://go.microsoft.com/fwlink/?LinkId=106727 |
Check the BizTalk Server database sizes for an increasing trend (performance check). | |
Determine whether the system is encountering database contention (performance check). | For more information about avoiding contention in the MessageBox database, see "Avoiding Disk Contention". |
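For the database size check above, "an increasing trend" is easier to judge with a number than by eye. The following is a rough sketch, assuming you already record the database size in MB at regular intervals; the function name `growth_per_sample` is hypothetical, and the least-squares slope is just one simple way to measure a trend, not something the Microsoft checklist prescribes.

```python
from typing import Sequence


def growth_per_sample(sizes_mb: Sequence[float]) -> float:
    """Least-squares slope of size over sample index (MB per sample)."""
    n = len(sizes_mb)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(sizes_mb) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(sizes_mb))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

A slope that stays clearly above zero from month to month suggests the database keeps growing and that archiving or purging deserves a closer look.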
IBM Lotus Domino server maintenance checklist
Task | Frequency |
---|---|
Back up the server | Daily, weekly, monthly |
Monitor mail routing | Daily |
Run Fixup to fix any corrupted databases * | At server startup and as needed |
Monitor Administration Requests database (ADMIN4.NSF) | Weekly |
Monitor databases that need maintenance | Weekly |
Monitor replication | Daily |
Monitor modem communications | Daily |
Monitor memory | Monthly |
Monitor disk space | Daily, weekly, monthly |
Monitor server load | Monthly |
Monitor server performance | Monthly |
Monitor Web server requests | Monthly |
Monitor server first Domino servers | Daily |
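A task and frequency list like the one above maps naturally onto a small lookup table. The sketch below is only an illustration: the names `DOMINO_TASKS` and `tasks_due` are made up for this example, and it covers only the rows with a plain daily, weekly, or monthly frequency (the Fixup row and the last row are left out).

```python
# Tasks and frequencies copied from the checklist above.
DOMINO_TASKS = {
    "Back up the server": {"daily", "weekly", "monthly"},
    "Monitor mail routing": {"daily"},
    "Monitor Administration Requests database (ADMIN4.NSF)": {"weekly"},
    "Monitor databases that need maintenance": {"weekly"},
    "Monitor replication": {"daily"},
    "Monitor modem communications": {"daily"},
    "Monitor memory": {"monthly"},
    "Monitor disk space": {"daily", "weekly", "monthly"},
    "Monitor server load": {"monthly"},
    "Monitor server performance": {"monthly"},
    "Monitor Web server requests": {"monthly"},
}


def tasks_due(frequency: str) -> list:
    """Return the tasks scheduled at the given frequency, sorted by name."""
    wanted = frequency.lower()
    return sorted(task for task, freqs in DOMINO_TASKS.items() if wanted in freqs)
```

Calling `tasks_due("weekly")` gives the list to print on a weekly worksheet, and the same table can drive a reminder job.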
Saturday, January 29, 2011, 09:16
The Basics
Hardware Manufacturer:
Model Number:
Serial Number:
Tower/Rack/Blade:
Physical Location of Server:
Purchase Date:
Warranty/Service Contract Number:
Warranty/Service Telephone Number:
Date Warranty Expires:

CPU
Number of CPU Sockets:
Number of Installed CPUs:
CPU Model:
CPU GHz Speed:
Number of Cores per CPU:
Type of Hyperthreading:
Is Hyperthreading on or off:
CPU L2 Cache Size:
CPU Bus Speed:
Motherboard BIOS Version:
Is BIOS Version Current:

Memory
Current Amount of RAM:
Additional RAM Capacity Available:
Number of Memory Slots Used:
Number of Memory Slots Available:
ECC Memory:

Network Adapter
Hardware Manufacturer:
Model Number:
Speed:
Number of Ports per Card:
Number of Cards:
BIOS Version Number:
Is BIOS Version Current:
NIC Speed/Duplex Setting:
Is the NIC Power Saving Feature Off:

Storage
Type (Local, DAS, SAN, Combo):

Local/Integrated RAID Controller
Number of Local RAID Controllers:
Type (SCSI, SAS, etc.):
Controller Hardware Manufacturer:
Number of Ports:
Controller Model Number:
Controller Cache Size:
Is There a Cache Battery:
Is Write Back Caching On:
Controller BIOS Version Number:
Is Controller BIOS Version Current:

External RAID Controllers
Number of External RAID Controllers:
Type (SCSI, SAS, etc.):
Controller Hardware Manufacturer:
Controller Model Number:
Number of External Ports:
Controller Cache Size:
Is There a Cache Battery:
Is Write Back Caching On:
Controller BIOS Version Number:
Is Controller BIOS Version Current:

Local Disk Configuration
RAID Configuration:
Number of Physical Drives:
Physical Dimension of Drives:
Drive Capacity:
Drive Speed/RPM:
Total Available Disk Space:

HBAs for External Storage
Number of HBAs:
Type (iSCSI, Fibre Channel, etc.):
Type of Connectors:
HBA Hardware Manufacturer:
HBA Model Number:
HBA BIOS Version Number:
Is HBA BIOS Version Current:

DAS Disk Configuration
RAID Configuration:
Number of Drives:
Physical Dimension of Drives:
Drive Capacity:
Drive Speed/RPM:
Total Available Disk Space:

SAN Disk Configuration
SAN Manufacturer:
SAN Model:
Type (iSCSI, Fibre Channel, etc.):
SAN Cache Capacity:
SAN Software Version:
Is SAN Software Current:
Number of Attached LUNs:
RAID Configuration per LUN:
Number of Drives Used per LUN:
Capacity of Drives Used in LUNs:
Speed of Drives Used in LUNs:
Available Disk Space per LUN:
Are LUNs Shared or Dedicated:

High Availability
Redundant Power Supplies:
Redundant NICs:
Redundant Controllers:
All Components Connected to UPS:
Is Server Physically Secure:
If Cooling Required, is it Redundant:

Clustering
Number of Cluster Nodes:
Number of Active Nodes:
Number of Passive Nodes:
Type of Quorum:
Type of Shared Storage:
Are HBAs Redundant:
Are Storage Switches Redundant:
Are NIC Switches Redundant:
Are NICs Redundant:

Backup
Tape Drive (Internal/External):
Tape Drive Manufacturer:
Tape Drive Model:
Local Disk:
DAS Disk:
SAN Disk:

Daily Operations Checklist
Checklist: Performing Physical Environmental Checks
Use this checklist to ensure that physical environment checks are completed.
Task:
· Verify that environmental conditions are tracked and maintained.
· Check temperature and humidity to ensure that environmental systems such as heating and air conditioning settings are within acceptable conditions, and that they function within the hardware manufacturer's specifications.
· Verify that physical security measures such as locks, dongles, and access codes have not been breached and that they function correctly.
· Ensure that your physical network and related hardware such as routers, switches, hubs, physical cables, and connectors are operational.
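If the server room has sensors that expose temperature and humidity readings, the environmental check above can be reduced to a range comparison. This is a minimal sketch: the reading names, the helper `out_of_range`, and especially the limits are hypothetical placeholders. The real limits have to come from the hardware manufacturer's specifications, as the checklist says.

```python
# Hypothetical limits for illustration only; replace them with the
# values from your hardware manufacturer's specifications.
LIMITS = {
    "temperature_c": (18.0, 27.0),
    "relative_humidity_pct": (20.0, 80.0),
}


def out_of_range(readings: dict) -> list:
    """Return the names of readings that are missing or outside their limits."""
    bad = []
    for name, (low, high) in LIMITS.items():
        value = readings.get(name)
        if value is None or not low <= value <= high:
            bad.append(name)
    return bad
```

A missing reading is treated as a failure on purpose, since a dead sensor is itself something the daily check should catch.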
Checklist: Check Backups
Task:
· Make sure that the recommended minimum backup strategy of a daily online backup is completed.
· Verify that the previous backup operation completed.
· Analyze and respond to errors and warnings during the backup operation.
· Follow the established procedure for tape rotation, labeling, and storage.
· Verify that the transaction logs were successfully purged (if your backup type is purging logs).
· Make sure that backups complete under service level agreements (SLA).
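Several of the backup checks above (did the previous run complete, is it recent, did it finish within the SLA) can be expressed as a few comparisons on the start and finish times of the last run. The sketch below assumes you can obtain those values from your backup software's log; the function name `backup_problems` and both time limits are hypothetical and need to be replaced with the figures from your own SLA.

```python
from datetime import datetime, timedelta

# Hypothetical limits; substitute the values agreed in your own SLA.
MAX_BACKUP_AGE = timedelta(hours=24)  # the checklist asks for a daily backup
SLA_WINDOW = timedelta(hours=4)       # assumed maximum allowed run time


def backup_problems(started: datetime, finished: datetime,
                    succeeded: bool, now: datetime) -> list:
    """Return a list of problems found with the most recent backup run."""
    problems = []
    if not succeeded:
        problems.append("backup did not complete successfully")
    if now - finished > MAX_BACKUP_AGE:
        problems.append("last backup is older than one day")
    if finished - started > SLA_WINDOW:
        problems.append("backup ran longer than the SLA window")
    return problems
```

An empty list means the run passed these three checks; tape rotation and log purging still need to be verified by hand or by the backup product itself.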
Checklist: Check CPU and Memory Use
Use this checklist to record the sampling time of each counter.
Checklist: Check Disk Use
Follow the checklist and record the drive letter, designation, and available disk space.
Task
· Create a list of all drives and label them in three categories: drives with transaction logs, drives with queues, and other drives.
· Check disks with transaction log files.
· Check disks with SMTP queues.
· Check other disks.
· Use server monitors to check free disk space.
· Check performance on disks.
Drive Letter | Designation (drives with transaction logs, drives with queues, and other drives) | Available space (MB) | Available % free |
---|---|---|---|
Your data here | | | |
Your data here | | | |
Your data here | | | |
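The columns of the drive table above can be filled in by a script instead of by hand. Here is a small sketch using `shutil.disk_usage` from the Python standard library; the function name `disk_report` and the dictionary keys are hypothetical, and the designation still has to be supplied by the administrator.

```python
import shutil


def disk_report(path: str, designation: str = "other") -> dict:
    """Return one row of the disk-use table for the drive containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    percent_free = round(usage.free / usage.total * 100, 1) if usage.total else 0.0
    return {
        "drive": path,
        "designation": designation,
        "available_mb": usage.free // (1024 * 1024),
        "percent_free": percent_free,
    }
```

On Windows you would pass a drive root such as `"C:\\"`, for example `disk_report("C:\\", "transaction logs")`; on Unix-like systems, pass a mount point.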
Checklist: Event Logs
Check event logs using the following checklist.
Task
· Check application and system logs on the server to see all errors.
· Check application and system logs on the Exchange server to see all warnings.
· Note repetitive warning and error logs.
· Respond to discovered failures and problems.
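Spotting repetitive warnings and errors, as the checklist above asks, is mostly a counting exercise. The following sketch assumes the logs have been exported to a list of records with hypothetical fields `level`, `source`, and `event_id`; the helper name `repeated_events` is also made up for this example.

```python
from collections import Counter


def repeated_events(events: list, min_count: int = 2) -> list:
    """Return (key, count) pairs for errors and warnings seen min_count+ times.

    The key is the tuple (level, source, event_id); pairs are ordered from
    most to least frequent.
    """
    counts = Counter(
        (e["level"], e["source"], e["event_id"])
        for e in events
        if e["level"] in {"Error", "Warning"}
    )
    return [(key, n) for key, n in counts.most_common() if n >= min_count]
```

Raising `min_count` is a simple way to cut down the noise on a busy server.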
Weekly Maintenance Checklist
Checklist: Create Reports
Use this checklist to create status reports to help with capacity planning, service level agreement (SLA) reviews, and performance analysis.
Task:
· Use daily data from event log and System Monitor to create reports.
· Report on disk usage.
· Create reports on memory and CPU usage.
· Generate uptime and availability reports.
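For the uptime and availability report above, a common definition is the share of the reporting period during which the service was up. The checklist does not give a formula, so treat this as one reasonable assumption; the function name `availability_percent` is hypothetical.

```python
def availability_percent(period_hours: float, downtime_hours: float) -> float:
    """Availability as a percentage of the reporting period."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return (period_hours - downtime_hours) / period_hours * 100
```

For a weekly report the period is 168 hours. Whether planned maintenance counts as downtime depends on how your SLA is worded.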
Checklist: Incident Reports
Use this checklist to create incident reports.
Task
· List the top generated, resolved, and pending incidents.
· Create solutions for unresolved incidents.
· Update reports to include new trouble tickets.
· Create a document repository for troubleshooting guides and post-mortems about outages.
Checklist: Antivirus Defense
Use this checklist to perform your antivirus defense.
Task
· Perform a virus scan on each computer.
· Check regularly that antivirus definition updates are being applied.
Checklist: Status Meeting
Use this checklist to conduct weekly status meetings during which the tasks are reviewed.
Task
· Server and network status for the overall organization and segments.
· Organizational performance and availability.
· Overview reports and incidents.
· Risk analysis and evaluation including upcoming changes.
· Capacity, availability, and performance reviews.
· Service level agreement (SLA) performance, and review items that have not met target objectives.