POL00397094 - Draft Report on Summary of Charges : Fujitsu and POL.


[TITLE \* MERGEFORMAT ]
FUJITSU [SUBJECT \* MERGEFORMAT ]

Document Title: [TITLE \* MERGEFORMAT ]
Document Type: [DOCPROPERTY "Document Type" \* MERGEFORMAT ]
Release: NA

Abstract: [COMMENTS \* MERGEFORMAT ]

Document Status: DRAFT

Author & Dept: [AUTHOR \* MERGEFORMAT ]

Internal Distribution:

External Distribution:

Approval Authorities:

Name | Role | Signature
David Chapman | Systems Qualities Architect |
Geof Slocombe | Infrastructure Design |

Note: See Post Office Account HNG-X Reviewers/Approvers Role Matrix (PGM/DCM/ON/0001) for guidance

© Copyright Fujitsu Services Ltd 2008 [SUBJECT \* MERGEFORMAT ] Ref: [DOCPROPERTY "Document Number" \* MERGEFORMAT ]
Version; 01488
[KEYWORDS \'MERGEFORMAT] Date: I 23148-Ju\layNov-087

UNCONTROLLED IF PRINTED. Page No: 1 of 126

POL-BSFF-0223764

0 Document Control

0.1 Table of Contents
[ TOC \o "1-3" \h \z \t "POA APPENDIX HEADING 1,1,POA APPENDIX HEADING 2,2" ]
0.2 Document History
Version No. | Date | Summary of Changes | Associated Change/PPRR Reference
0.1 |  | Initial Release |
0.2 | 23-Nov-2007 | Incorporated initial comments and updates from meeting with Will Russell and John Halface and reviewers |
0.3 | 01-May-2008 | Incorporate comments. Added ref to DEV/GEN/SPE/0007 for PIM failure |
0.4 | 7a.iui2008 | Incorporate comments and reformat test section to use a summary table in the overview document |
 |  | Incorporate changes to EST platform services after informal comments from Pat Carroll |
 |  | Changes to VPN platforms | cooz19
 |  | Update to reflect latest network design | P0097
 |  | Reference made to RAD changes | crore
 |  | KMS moved into line with software random number generator | P4506
0.3 Review Details - NB: NOT SUBJECT TO APPROVERS & REVIEWERS ROLE MATRIX
Review Comments by EriMonday, 29460 AugustDecember May 20087
[HYPERLINK ‘mailto:Edward ashfordl "GRO & [HYPERLINK
“mailto:PostOfficeAccountDocumentMaiiagemnernI
Role | Name
Development / Business Continuity Manager | Paul Stewart, Joseph Diffin / Tony Wicks
Architecture Systems Quality Architect | David Chapman
Development / SSC Manager | Adrian West / Mik Peach*
SSC | Mik Peach
Business Continuity | Tony Wicks
Migration Architect | Brian Ridley / Peter
System Test | John Rogers
Name
Apologies if roles are not quite right - new announcement wasn't too explicit
Programme Manager / Network Architect | Phil Day / Mark Jarosz

Applications Architecture | Dave Johns
Architect | Jason Clark
Security Architect | Jim Sweeting
Test Design | Peter Robinson
Test Design | ik
Head of Service Management | Steve Denham
Head of Service Change & Transition | Graham Welsh
Service Support | Peter Thompson
Service Network | Alex Kemp
Data Centre Migration | Caroline Montgomery
Infrastructure Design | Geof Slocombe
Testing | Peter Dreweatt
SV&I Manager | Sheila Bamber
Tester | Hamish Munro
RV Manager | James Brett (POL)
Vi TE Manager | Peter Rickson
HNG-X Acceptance & Risk | Wayne Roberts (POL)
Integrity Testing | Alan Child
Integrity Testing | Michael Welch
Core Services | Mark Walsh
Core Services | Andrew Gibson
Business Architect | Gareth Jenkins
Security Architect | Jim Sweeting
Systems & Estate Management Architect | Ian Bowen
Head of Engineering | Barbara Perek
Quality | Jan Holmes
coe Witla Member
cro. | Matt Adby / Giacomo Piccinelli
Requirements & Architecture Manager | Martin Gait / John Lake
Implementation & Transition PM | Martin Brett / Caroline Montgomery
Head of Infrastructure | Dave Sackman / Geof Slocombe
Head of Test | Pete Dreweatt

Programme Manager | Phil Day
Infrastructure Project Manager | Dean Parsons / Mike Brady
Platform & Storage Architect | Jason Clark
POLES Migration Design | Chvle-Credland / Joseph Diffin
Network & DNS Design | Dave Haywood
Security Architect | Jim Sweeting
Network Design | Dave Tanner
Network Design | Andrew Oram
Test Architect | Peter Robinson
SSC & Time Services Design | Pat Carroll
Design | Tom Northcott
Project Manager | Pat Lywood / MadcWaieh
Unix/DBA/NT Support | Andrew Gibson
suc | Ian Cooley
Network Support | Dave Jackson
Migration Architect | Brian Ridley / Jeremy Worrell

Leeann Bere sobs kaos ta staal

D,-Mills.C,-Mital-U, Morris T, Noad P, Olubor-J, Siddiqi
Tomlinson-P, Walton White-N-Williams-A, Wright M

Position/Role | Name
HNG-X Implementation Manager, MG | Will Russell
Programme & Project Manager, Solutions Group | Andy Heath / Mike Jackson

(*) = Reviewers that returned comments

0.4 Associated Documents (Internal & External)
PGM/DCM/TEM/0001 (DO NOT REMOVE) | 1.0 | 13/6/06 | Fujitsu Services Post Office Account HNG-X Document Template | Dimensions
ARC/SOL/ARC/0001 |  |  | HNG-X Overall Solution Architecture | Dimensions
REQ/CUS/STG/0001 |  |  | HNG-X Migration Strategy - Agreed Assumptions and Constraints | Dimensions
COM/CUS/SCH/0011 | 1.0 | 31/8/06 | Schedule B2 - Business Continuity (CCN 1200) | Dimensions
COM/CUS/SCH/0014 | 2.0 | 2507 | Schedule B3.3 - HNG-X Central and Telecommunications Infrastructure (CCN 1200) | Dimensions
SVM/SDM/SIP/0001 |  |  | Business Continuity Framework | Dimensions
SVM/SDM/PLA/0001 |  |  | HNG-X Business Continuity Support Services Test Plan | Dimensions
SVM/SDM/PLA/0002 |  |  | HNG-X Business Continuity Services Test Plan | Dimensions
SVM/SDM/PLA/0003 |  |  | HNG-X Business Continuity Operational Test Plan | Dimensions
SVM/SDM/SD/0003 | 1.4 | 5/3/08 | Data Centre Operations Service: Service Description | Dimensions
ARC/GEN/REP/0001 |  |  | HNG-X Glossary | Dimensions
ARC/PER/ARC/0001 |  |  | System Qualities Architecture | Dimensions
ARC/SEC/ARC/0003 |  |  | Security Architecture | Dimensions
ARC/PPS/ARC/0001 |  |  | Platforms and Storage Architecture | Dimensions
ARC/APP/ARC/0005 |  |  | Branch Database Architecture | Dimensions
ARC/APP/ARC/0008 |  |  | Online Service Architecture | Dimensions
ARC/SYM/ARC/0001 |  |  | HNG-X System and Estate Management Overall Architecture | Dimensions
ARC/SYM/ARC/0003 |  |  | HNG-X System and Estate Management Monitoring | Dimensions
ARC/GEN/STD/0002 |  |  | [TITLE \* MERGEFORMAT ] [NB Ref may change] | Dimensions
TST/SOT/HTP/0006 |  |  | HNG-X: ITU VI Business Continuity High Level Test Plan | Dimensions
DES/MIG/HLD/0001 |  |  | HNG-X Migration High Level Design for Branches | Dimensions
DES/MIG/HLD/0002 |  |  | HNG-X Migration High Level Design for Data Centres | Dimensions
DES/SYM/HLD/0015 |  |  | HNG-X Backup and Recovery High Level Design | Dimensions
DES/APP/HLD/0020 |  |  | HNG-X Branch Database Design | Dimensions
DES/PPS/HLD/0009 |  |  | HNG-X Platform Type List | Dimensions
DES/PPS/HLD/0007 |  |  | Storage High Level Design | Dimensions
DES/PPS/HLD/0003 |  |  | Active Directory High Level Design for HNG-X | Dimensions
DES/PPS/HLD/0025 |  |  | HNG-X BladeFrame and PAN High Level Design | Dimensions
DES/SYM/HLD/0001 |  |  | MON - Supporting Platforms | Dimensions
DES/SYM/HLD/0002 |  |  | MON - Supporting Agents | Dimensions
DES/SYM/HLD/0003 |  |  | MON - Horizon Support | Dimensions
DES/SYM/HLD/0004 |  |  | MON - Usability | Dimensions
DES/NET/HLD/0006 |  |  | Domain Naming System High Level Design | Dimensions
DES/NET/HLD/0007 |  |  | HNG-X SAN High Level Design | Dimensions
DEV/INF/LLD/0004 |  |  | HNG-X Storage Design | Dimensions
DES/NET/HLD/0008 |  |  | Data Centre LAN Design | Dimensions
DES/NET/HLD/0009 |  |  | HNG-X Wide Area Network Design | Dimensions
DES/NET/HLD/0014 |  |  | HNG-X Branch Access HLD | Dimensions
DES/NET/HLD/0015 |  |  | HNG-X Transit LAN Design | Dimensions
DES/NET/HLD/0010 |  |  | HNG-X Branch Router Design | Dimensions
DES/NET/DPR/0002 |  |  | Design Proposal for Branch Router | Dimensions
DES/NET/HLD/0010 |  |  | Branch Router Network High Level Design | Dimensions
DES/NET/HLD/0012 |  |  | Network Management System High Level Design | Dimensions
DES/NET/HLD/0013 |  |  | Time Synchronisation High Level Design | Dimensions
DEV/INF/LLD/0041 |  |  | HNG-X Data Centre LAN LLD | Dimensions
DEV/NF/ON/0002 |  |  | HNG-X VLAN Mappings | Dimensions
DEV/GEN/ION/0001 |  |  | FTMS Configuration for the TIP Gateways | Dimensions
DEV/GEN/ION/0002 |  |  | FTMS Configuration for the EDG Gateways | Dimensions

Unless a specific version is referred to above, reference should be made to the current approved
versions of the documents.

0.5 Abbreviations

Abbreviation Definition

A&L Alliance & Leicester
AD Active Directory. An implementation of lightweight directory
access protocol (LDAP) that provides central authentication and
authorization services.
ADSL Asymmetric Digital Subscriber Line
AP Automated Payment
AP-ADC Automated Payment - Advanced Data Capture
APOP Automated Payment Out-Pay
APS Automated Payment Service. Also used for the name of the
Oracle database that supports this service.
ATM Asynchronous Transfer Mode. A form of network
communication that does not rely on each packet of data being
acknowledged before the next is transmitted.
BAL Branch Access Layer

BCT Business Continuity Test
CDL EMC Clariion Disk Library. Brand name for a type of VTL.
caw Fujitsu Services Backbone Network, a private managed
network operated by Fujitsu Services
CAPO Card Account Post Office. An external client providing banking
services.
Cisco ACE Application Control Engine. ACE polls a number of possible
service providers and advertises an external virtual address for
the service based on pre-determined selection criteria. This is a
common means of providing resilience for web services.
The full term is used to avoid ambiguity as ACE is a common
abbreviation.
COTS Commercial off the shelf
DC Data Centre
DCS Debit Card Service
DHL Definitive Hardware List. Also a courier company used to ship
components to site.
DHS Definitive Hardware Store
DMX EMC "Direct Matrix". DMX3 is the latest generation of the EMC
Symmetrix range of disk arrays.
DMZ Does actually stand for de-militarized zone, but is used here in
the networking security and firewall sense to mean an
intermediate zone between two networks which affords
protection to the main servers. For example one would find a
mail proxy in the DMZ between the internet (the Rest of the
World) and the mail server on the Data Centre LAN.
DNS Domain Name Service
DR Disaster Recovery
DVLA Driver and Vehicle Licensing Agency
DWDM Dense Wave Division Multiplexing. A means of passing many
optical signals simultaneously along a single fibre optic path.
Typically used for long distance links where the installation cost
is very high, it allows several customers to be offered a service
where each appears to have their own dedicated link.
ECC EMC Enterprise Control Centre. A Storage Management
System
EDG Electronic Data Gateway
EMC Disk Array manufacturer
EPOSS Electronic Point of Sale Service; a service that is
relevant to HNG-X.
ETU Electronic Top-Up
FAD Finance Accounts Division, part of Post Office Ltd
FC fibre channel (see glossary)
FRU Field-Replaceable Unit. A part or component of a device or
system that can easily be replaced by a skilled technician
without having to send the entire device or system to be
repaired.
FTMS File Transfer Management Service; HNG-X process that
provides configurable file transfer services between Horizon
and Post Office Ltd clients. Services available include data
compression and encryption.
GPRS General Packet Radio Service. A generic term covering a
number of technologies used to provide internet connectivity, for
example using a mobile phone.
HA High Availability
HBA Host Bus Adapter
HLD High Level Design
HNG-X Horizon Next Generation - Plan X
IP Internet Protocol
IPMP IP multi-pathing. A Solaris driver to provide resilient network
connections.
ISDN Integrated Services Digital Network
LAN Local Area Network
LINK The organisation responsible for the branded and shared network
of cash machines and self-service terminals of certain member
banks and building societies in the UK, which enables services
from one member bank or building society to be available at
cash machines of all member banks and building societies.
LPAN Logical Processor Area Network. A subset of a PAN (see
below)
LST Live System Test. A pre-production test rig built and operated
like live.
LTPDB Long Term Performance Data Base. A repository for capacity
planning and performance management statistics.
LUN Logical Unit Number
MTAS MID / TID Allocation System.
NAS Network-attached Storage. An appliance based mechanism for
presenting storage as generic shares. NAS is very flexible, and
allows many servers to share a file store, but does have
performance limitations compared to dedicated storage.
NBS Network Banking Service - one of the A&L, CAPO or LINK
Authorisation Services
NBU Symantec NetBackup
NIC Network Interface Card
NPS Network Banking Persistent Store
NTP Network Time Protocol
Oracle RAC Oracle Real Application Cluster. The full term has been used
rather than the abbreviated "RAC" as this might be confused
with the Network Banking Request Authorise Confirm model.
OS Operating System
PAF Postal Address File
PCI Payment Card Industry. A consortium of companies which has
developed standards for processing electronic payments,
especially relating to the security and storage of card-holder
details.
This term will usually appear as "PCI compliance".
PCI Peripheral Component Interconnect. A replacement developed
by Intel for the early personal computer bus which has migrated
to Data Centre class computers because of the
ability to share mass-produced peripheral components (e.g.
network interface cards) and the software drivers for those
components. Although this seems a retrograde step it should be
remembered that today's desk top computers are considerably
more powerful than a super computer from 1985 which cost
many millions of pounds.
This term will usually appear as "PCI bus".
PAN Processor Area Network. A term used to describe the overall
set of BladeFrame resources.
The alternate meaning of Primary Account Number is not used
in this document.
PIM Power Input Module
POA Post Office Account
POL Post Office Ltd
POL-FS Post Office Ltd Finance System. SAP based system providing
financial accounting for the branch based business, based in
Fujitsu Services Data Centres.
POL-MIS Post Office Ltd Management Information System based in the
Northern Data Centre
RAD Real-time Active Dashboard. Event filtering service based on
Tivoli Netcool used to provide a business view of events.
RHEL Red Hat Enterprise Linux
RPO Recovery Point Objective
RTO Recovery Time Objective
SAN Storage Area Network based on fibre channel protocol
SAS Secure Access Server
SCW Security Configuration Wizard
SDH Synchronous Digital Hierarchy standard developed by the
International Telecommunication Union (ITU), documented in
standard G.707 and its extension G.708
SLT Service Level Target
SOx Sarbanes-Oxley
SPOF Single Point Of Failure
SRDF EMC proprietary term for storage replication between sites,
"Symmetrix Remote Data Facility".
SRRC Service Resilience and Recovery Catalogue. A document
describing the failure modes, business impact, events raised,
and recovery mechanism for each service.
SSH Secure Shell
SSL Secure Socket Layer
TES Transaction Enquiry Service
TPS Transaction Processing Service; Horizon service that formats
data for transmission to POL-MIS and POL-FS and other places.
TWS Tivoli Workload Scheduler. A batch scheduling system. This is
the new name for the latest version of the Unison Maestro
scheduler used in Horizon.
V&I Volume and Integrity Test Rig. A full-scale test rig used for
performance testing and non-functional testing.
VIP Virtual IP
VLAN Virtual LAN. Larger switches can operate as if they were a
number of separate logical switches. This allows a larger
number of services to be taken down when a switch fails.
VSAN Virtual SAN. Ditto. Although IP and fibre channel differ, the
approach used by Cisco has strong analogues for virtualisation.
VPN Virtual Private Network
VTL Virtual Tape Library
WAN Wide Area Network

0.6 Glossary

Term Definition

trunking, trunked

Used of network connections. Trunked describes a switch-to-switch interface, where
several VLANs may be shared between the switches using a single physical network
port. This allows great flexibility in shaping the physical network architecture using
core-edge topologies, and in effect this is how BladeFrame is leveraging network
virtualisation.

KB, MB, GB, TB
mbps, gbps

There are no ISO standards for "byte" and "bit". This document will use the terms on
the left to indicate kilobytes, megabytes, gigabytes, terabytes where:

kB = 1024 bytes
MB = 1024 kB
GB = 1024 MB
TB = 1024 GB

Note that disk salesmen often quote gigabytes as 1000 MB.

mbps = megabit per second
gbps = gigabit per second

Normal Ethernet speeds are 10/100/1000 mbps.
100 mbps roughly equates to 7 MB/s.

Normal fibre channel speeds are 1/2/4 gbps.
1 gbps roughly equates to 70 MB/s.

The "rounding errors" are down to things like frame headers, checksums and
queuing on shared links.

Other common network speeds are 34 mbps (E3 or T3 leased line) and 155 mbps
(ATM over SDH).

Readers who wish to explore this further should follow the links on Wikipedia which
are easily accessed via the page on Fibre channel.

A knowledge of these terms is not essential for reading this document.
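
The unit conventions in this entry can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes one megabit is 1,000,000 bits. It compares the overhead-free ceiling of a link with the working figure quoted in this entry (about 7 MB/s per 100 mbps), and shows how far a "1000 MB" gigabyte falls short of the 1024 MB definition used here.

```python
# Binary units as defined in this glossary entry.
KB = 1024          # bytes in a kilobyte
MB = 1024 * KB     # bytes in a megabyte
GB = 1024 * MB     # bytes in a gigabyte
TB = 1024 * GB     # bytes in a terabyte


def raw_mb_per_s(mbps: float) -> float:
    """Ceiling in MB/s for a link of `mbps` megabits per second.

    Ignores all protocol overhead. Assumes 1 megabit = 1,000,000 bits.
    """
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second / MB


def rule_of_thumb_mb_per_s(mbps: float) -> float:
    """Working figure quoted in the glossary: about 7 MB/s per 100 mbps."""
    return mbps * 7 / 100


def marketing_gb_shortfall() -> float:
    """Fraction by which a '1000 MB' gigabyte falls short of 1024 MB."""
    return 1 - (1000 * MB) / GB


if __name__ == "__main__":
    for speed in (100, 1000):
        print(speed, "mbps:",
              round(raw_mb_per_s(speed), 2), "MB/s ceiling,",
              rule_of_thumb_mb_per_s(speed), "MB/s rule of thumb")
    print("1000 MB gigabyte shortfall:",
          round(marketing_gb_shortfall() * 100, 2), "%")
```

The gap between the two throughput figures is the allowance this entry makes for frame headers, checksums and queuing on shared links.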

fibre channel

Fibre Channel is a gigabit-speed network technology primarily used for storage
networking. Despite common connotations of its name, Fibre Channel signaling can
run on both twisted pair copper wire and fibre-optic. It is also possible to run IP
connections over fibre channel, but the advent of gigabit ethernet has made this less
common.

http://en.wikipedia.org/wiki/Fibre_channel

switched fabric

Commonly abbreviated to fabric, and specifically used here to mean fibre channel
switched fabric (FC-SW).

A fabric is a network of devices connected by switches. Each is identified by a world
wide node name (WWNN). Many VSANs can be created in each fabric.

In a typical resilient deployment mirrored fabrics are used where each VSAN has a
counterpart in the other fabric, and resilient paths between hosts and storage devices
are provided through both of the mirrored fabrics.

N+1

This is a term commonly used in describing resilience. If two servers are required to
provide adequate performance, and any server can perform all the tasks required,
then deploying three servers provides one more than is required (N+1), and the
failure of a single server does not reduce performance below the level required.
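
The N+1 rule described in this entry can be expressed as a small sizing check. This is a minimal sketch with hypothetical function names; it assumes, as the entry does, that any server can perform all the tasks required, so only the count of surviving servers matters.

```python
def servers_to_deploy(required: int, spares: int = 1) -> int:
    """Return the number of servers to deploy for N+spares resilience.

    `required` is N, the number of servers needed for adequate performance.
    """
    if required < 1:
        raise ValueError("at least one server must be required")
    if spares < 0:
        raise ValueError("spares cannot be negative")
    return required + spares


def meets_performance(deployed: int, required: int, failed: int) -> bool:
    """True if the servers left after `failed` failures still cover N."""
    return deployed - failed >= required


if __name__ == "__main__":
    deployed = servers_to_deploy(required=2)        # the example in the text
    print("deploy:", deployed)
    print("one failure ok:", meets_performance(deployed, 2, failed=1))
    print("two failures ok:", meets_performance(deployed, 2, failed=2))
```

With two servers required, three are deployed; one failure leaves the required level intact, while a second failure does not.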

Hydra

The term hydra is used to avoid ambiguity which the term "Horizon Branch Services"
would otherwise introduce during migration.

During the counter migration phase a "Hydra" component will remain to support
Horizon counters. This will continue to operate Active/Active, but will be removed
once the last counter has migrated to the HNG-X application.

Hydra ends when there are no more Horizon counters, but note that SYSMAN2 is
required until all NT systems are upgraded or removed.

stateless

Used specifically in this document to mean a system which does not need to store
information about its previous state. The opposite term stateful is used to define a
system which has some data that would need to be recovered in the event that the
system was lost.

In designing recovery solutions stateless servers are very flexible as they merely
need to be rebuilt or restarted, whereas stateful systems need to be failed over or
recovered.
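
The distinction drawn in this entry can be sketched as a simple lookup from a service's state model to the class of recovery it needs. The service names and the `Service` type below are hypothetical and are not taken from the HNG-X platform list; the sketch only restates the rule given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Illustrative record: a service and whether it holds recoverable data."""
    name: str
    holds_state: bool


def recovery_action(service: Service) -> str:
    """Map a service to the recovery approach described in the glossary."""
    if service.holds_state:
        # Stateful: data would be lost with the system, so it must be
        # failed over or recovered.
        return "fail over or recover"
    # Stateless: nothing to recover, so a rebuild or restart is enough.
    return "rebuild or restart"


if __name__ == "__main__":
    examples = [
        Service("example-web-tier", holds_state=False),   # hypothetical name
        Service("example-database", holds_state=True),    # hypothetical name
    ]
    for svc in examples:
        print(svc.name, "->", recovery_action(svc))
```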

0.7 Changes Expected

Changes

Network and migration design are still work in progress, and both have an impact on DR and HA design; however,
no substantial changes are expected.

The list of platform types is still subject to change. The design of BMX is not complete.

0.8 Accuracy
Fujitsu Services endeavours to ensure that the information contained in this document is correct but, whilst every

effort is made to ensure the accuracy of such information, it accepts no liability for any loss (however caused)
sustained as a result of any error or omission in the same.

0.9 Copyright

© Copyright Fujitsu Services Limited (2007). All rights reserved. No part of this document may be reproduced,
stored or transmitted in any form without the prior written permission of Fujitsu Services.

1 Introduction
An overview is provided which details the platforms and summarises the DR
and resilience models in tabular form.

TECHNICAL OVERVIEW OF DESIGN FOR BUSINESS CONTINUITY | THE DOCUMENT IS INTENDED TO INFORM APPLICATION
DESIGNERS ABOUT THE RESILIENCE AND DR SOLUTIONS, INCLUDING BACKUP & RECOVERY, SO THAT THE SYSTEM
QUALITIES CHAPTER IN THE APPLICATION HLD PROVIDES THE INFORMATION REQUIRED BY THE BACKUP DESIGNER AND THE
DRAN

Business continuity is a complex subject. This document is not intended to cover every aspect of
business continuity design, planning or operation, but to outline the various technologies chosen, to cover
all services at a high level, and to direct the reader to those areas of architecture, design and business
continuity planning where the appropriate detailed view may be found.

1.1 Scope

This document covers two aspects of the high-level architectural design as it relates to the steady-state
hosting of the HNG-X solution: resilience and disaster recovery.

This document describes the key architectural elements which support both resilience and disaster
recovery, and defines a number of classes of hosted service.

For each class of service, this document outlines the high level procedures for recovering from failures,
and for recovering service back to the primary Data Centre in case of a disaster which forces
service to the secondary Data Centre.

Disaster recovery during the Data Centre migration will be covered at a high level. It is
expected that related groups of services will be migrated together, and that each group of services may
use Horizon or HNG-X DR mechanisms independently.

1.2 Not in Scope

This document assumes a general knowledge of the HNG-X architecture.

Backup and Recovery design covers the recovery from data corruption.

Enhanced Agent and Correspondence Server Resilience and Recovery (EACRR) for Hydra will be
covered in a separate design document DES/PER/HLD/0003 HNG-X Branch Trading Resilience Design.

Resilience of an application is the responsibility of the application designer. Guidelines for designing a
resilient service and a number of technology options are presented in this high level design, and a
summary of the approach adopted by each service, but the reader is referred to the application high level
design for further details.

This document does not cover the provision of secondary sites for staff, such as the SMC at Stevenage
and the SSC at Bracknell. The need to provide such provision is covered by the Schedules for those
services (this includes the Development and Integration functions) and the only requirement that is in
scope for this document is that the network architecture should allow any of these sites to connect to
either data centre. These are covered by the Business Continuity Plans for each service, and are
substantially unchanged from Horizon.

The requirements of these secondary sites are principally driven by the need to keep operating the live
service (SVM/SDM/SD/0003) and the decision as to whether we require a "warm" or "cold" option is
driven by risk and cost rather than design. As far as possible I have tried to encourage access from
standard corporate workstations for support staff which reduces the cost of these provisions and
maximises flexibility, but there are clearly cases, either for reasons of bandwidth or security, where this is
not possible.

1.3 Changes from Horizon

In Horizon the Data Centres operated in an Active/Active mode. The counters connect to the
Riposte asynchronous messaging store, and could connect to servers in either site.

In HNG-X the Data Centres operate in a Production/Test mode, where the Standby
site is used for system testing. This is facilitated by the deployment of BladeFrame which simplifies the
definition of virtual sets of servers that exist in the SAN rather than in a physical implementation.


2 Resilience and Disaster Recovery

These two topics, supported by backup and recovery and estate monitoring, define how the overall
Business Continuity Plan is to be implemented. SVM/SDM/PLA/0003 is the Business Continuity Plan.

Requirements are principally derived from the following sources:
COM/CUS/SCH/0011 - Schedule B2 - Business Continuity
COM/CUS/SCH/0014 - HNG-x Central and Telecommunications Infrastructure
SVM/SDM/SD/0003 - Data Centre Service: Service Description

Note that other services may have expressed requirements in DOORS which relate to these
requirements. This document does not attempt to reconcile all such requirements, and for a particular
service the service HLD should perform this function.

Although there are many other schedules and the scope of business continuity is considered in its widest
sense, the use of a web-services architecture with a central data repository means that the Data Centre
service description serves to capture the requirements of the end-to-end service quite neatly.

Schedule B2 is met by the Business Continuity Framework and Data Centre Service Description. No
further direct reference will be made to it here.

ARC/SOL/ARC/0001 Overall Solution Architecture, especially Chapter 6, provides an architectural
context for this design.

ARC/PER/ARC/0001 System Qualities Architecture summarises the requirements of the service for
availability and performance.

DES/SYM/HLD/0015 Backup & Recovery High Level Design deals with recovery from data corruption at
an application level. This is outside the scope of DR, as the corruption may have also corrupted the data
at the secondary site and DR is not an appropriate means of recovery. It is generally part of application
resilience.

ARC/SYM/ARC/0001 and ARC/SYM/ARC/0003 describe the overall approach to Estate Management,
and specifically to monitoring the estate. If a problem is not detected, none of the plans will
be put into effect. The timeliness of detection and the response to any alarms raised are part of the
overall impact to the customer of the service outage, although specific service level agreements may
break the response down into smaller units that are more easily measured.

2.1 Definitions

Resilience relates to elements of the hosting solution which provide tolerance to faults within the primary
Data Centre; the main design goal being that any single failure will not prevent the application
from continuing to work within the primary Data Centre.

Disaster Recovery relates to the elements of the hosting solution which allow hosting of the solution
from a secondary Data Centre in the event of a catastrophic failure at the primary Data
Centre with very little or no loss of data.


2.2 Approach to Resilience

The aim of providing resilience within the Data Centre is to allow the solution to survive the
failure of a single part of the infrastructure. As far as possible, this should be extended so that multiple
failures can be withstood, to a commercially reasonable level.

In addition to this, the design of the resilience should also provide for simplified maintenance and
minimisation of any outage required to replace or fix failed parts.

In the event of a major environmental problem, such as a major fire at either site, the remaining site is
designed to be N+1 resilient for all business critical components.

Section 6.2.3 details single points of failure that are declared in the design. These are covered by risk
management, and are typically services that can be bought if a disaster really occurs or which would be
too expensive to justify the risk that is mitigated.

2.3 Approach to Disaster Recovery

In the event of a catastrophic failure, that is, a failure which renders the primary Data Centre
unable to host the solution in a commercially viable manner, the hosting of the entire solution will move to
the secondary site (a “site failover”).

To support this, all servers, network switches, routers, firewalls, storage and supporting infrastructure at
the primary site will be duplicated at the secondary site, and will operate in a manner where they are
permanently ready to fail over. This places limitations on the use of such components for testing, and it
may be necessary to deploy dedicated test equipment at the secondary site where use of DR equipment
is prohibited by such requirements.

Following a site failover it is assumed that any issue or failure at the primary site will be resolved, and that
the hosting of the solution will move back to the primary site (a “site failback”). A failback is disruptive to
the branch service, and failback will always be a planned event that aims to minimise the service outage
of failback. Such service outages do not normally count towards SLA targets. With the increasing
likelihood of 24 x 7 counter operations it may be much more difficult than in Horizon to agree the timing
for failback.

2.4 Approach to Development and Testing

For any DR solution at least 80% of the success depends on procedures and processes rather than the
technical implementation. In addition, key operational staff need to stay familiar with these processes, and
the processes themselves need to be kept up to date as the solution develops during its life cycle, or as
problems and work-rounds are found during live use or testing.

Fujitsu Services operate a series of business continuity tests under the management of the Business
Continuity Manager reporting to the Business Risk Manager and various Service Managers. These tests
are run in the live environment, and are designed to be as realistic as possible without putting the service
deliberately at risk.

The HNG-x solution has a lot of elements that will be very familiar to Horizon support staff, such as
the Legacy Batch Server, Oracle RAC based HA databases, and the POLFS system. There are other
elements such as network banking agents which look similar but have subtle differences, and yet other
areas such as the deployment of BladeFrame server virtualisation, deployment of Oracle DataGuard and
Streams, and operation in Production/Test mode which are new.


A separate project known as "Pathfinder" is being performed in situ in Belfast to prove the migration
approach for the POLFS service. Initially this will connect the four sites to give a single SAP Landscape.
The existence of this project allows the POLFS resilience and DR approach to be validated in Belfast,
and this validated solution will be reused for the main Solaris Oracle database server (DAT) which has
similar characteristics to the POLFS XI subsystem.

The production infrastructure is being deployed a few months in advance of when it is required. This de-
risks a very complex deployment, and also permits a representative "classroom" to be assembled to
provide support staff with on-site training run in conjunction with Fujitsu-Siemens Professional Services to
introduce them to some of the new concepts.

The Data Base Administration team in Belfast are also running a representative development
environment on the pre-production estate to assist the Branch Database designers in producing the
Branch Database High Level Design (DES/APP/HLD/0020). This will ensure that the designers follow
Infrastructure Services best practices for implementing Oracle DataGuard and Oracle Streams, and that
the processes used to support the HNG-X Solution are familiar to the DBA Team, and similar to
processes used on other accounts. Fujitsu Services are taking advantage of their experience in offering
similar services (albeit not on such a large scale) to other customers in ensuring that a reliable and
effective service is delivered for HNG-X.

The advance infrastructure will then be available for initial DR Development work, which will mainly
consist of ensuring that the basic principles demonstrated in the classroom are turned into rigorous
processes and scripts for the actual solution.

Finally a series of failover tests has been built in to the V&I Cycle testing. These tests will be run by ITU
and operated by the actual support staff. The number of tests allows all staff to be made familiar with the
solution, and for F3 and F4 (as the final two tests are known) to be run in cooperation with the Business
Continuity Manager and their counterpart from the Customer and serve as acceptance tests. The V&I
HLTP (TST/SOT/HTP/0006) should be referred to for a detailed view of this testing.

These tests will be preceded by a procedural walk-through to familiarise all the Service Delivery Units
with their role in the new solution. This is similar to the processes involved in existing business continuity
planning and rehearsal.

In addition, it is expected that V&I will include a programme of non-functional testing designed to test the
N+1 resilience features and recovery from backup.

It is also a design goal to make the eventual Test environment representative enough that many of the
N+1 resilience features and their operation may be tested in the Test environment in a realistic enough
manner to satisfy many of the requirements of Business Continuity Testing without impacting on the
Production service.

2.5 Ongoing Business Continuity Testing

A series of walk-throughs is also performed, especially for those tests where there is considerable
interdisciplinary interaction. These exercises are highly valuable in keeping the processes fresh, and
also as training exercises prior to a full business continuity test, especially where new staff are being
trained. They also provide an opportunity for junior staff members (who may find themselves at the sharp
end one night) to manage their team's input in a non-threatening environment.

The Business Continuity Framework (SVM/SDM/SIP/0001) and Business Continuity
Plan (SVM/SDM/PLA/0003) cover normal "Business as usual" testing, and a schedule is published
annually to coordinate this.

Inevitably a test of site DR will involve some impact to the normal service. In a Production/Test campus
pair this will also involve loss of the Test Service for the duration of the test. This is particularly noticeable
for the counters, which in Horizon are able to operate autonomously as isolated Riposte nodes (although
temporarily without network banking), but at HNG-x must contact the Branch Database in the Data
Centre in order to transact any business and will suffer an outage of up to two hours as the
failover occurs.

In order to minimise the impact to the business it is recommended that business continuity tests be
scheduled to fail over on Saturday at 0200 and fail back on Sunday at 0200, which will cause the
Saturday business day and main overnight batch processing to be run from IRE19, and allow a
contingency window on Sunday to ensure readiness for Monday morning. Monday morning is when the
peak business transaction rate occurs.
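The recommended schedule can be expressed as a small date calculation. The following is a minimal sketch only: the function name and the returned keys are hypothetical, and the two hour allowance is simply the "up to two hours" failover outage quoted above, applied here as an assumption.

```python
from datetime import datetime, timedelta

SATURDAY = 5  # datetime.weekday(): Monday is 0, Saturday is 5
FAILOVER_OUTAGE = timedelta(hours=2)  # assumption: the "up to two hours" outage


def next_test_window(now: datetime) -> dict:
    """Return the next Saturday 02:00 failover and Sunday 02:00 failback."""
    days_ahead = (SATURDAY - now.weekday()) % 7
    failover = (now + timedelta(days=days_ahead)).replace(
        hour=2, minute=0, second=0, microsecond=0
    )
    if failover < now:
        # It is already past 02:00 on a Saturday, so use the following week.
        failover += timedelta(days=7)
    return {
        "failover": failover,
        "branches_back_by": failover + FAILOVER_OUTAGE,
        "failback": failover + timedelta(days=1),
    }
```

The period between the failback time and Monday morning corresponds to the contingency window described above.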


3. Core Architectural Elements

3.1 Data Centres

There are two Data Centres, based in Northern Ireland, denoted by the Fujitsu location codes
IRE11 and IRE19.

Under normal operating conditions, IRE11 offers the Production HNG-x service and IRE19 offers
the Test service (strictly a number of Test services).

Production components of the overall solution are active at both IRE11 and IRE19 simultaneously in
order to facilitate failover, and Test components are located at IRE11 to allow Test systems to fully test
changes to the DR solution once in service.

The network is considered active at both Data Centres and is managed as a single Production
entity. This is required to support the rapid failover from IRE11 to IRE19 in a disaster recovery situation.

The direct distance between IRE11 and IRE19 is 6.3 kilometres (3.9 miles). Via the main roads, the
distance is 15.8 kilometres (9.8 miles).

3.2 Intercampus Link

The intercampus link is a high speed fibre link between the primary and secondary Data
Centre hosting sites, comprising two redundant, diversely routed fibre links which are DWDM multiplexed
to form a number of usable logical links. The DWDM end points are separated by at least 5m at each site.
There is a detailed description in the SAN High Level Design (DES/NET/HLD/0007).

Over each of the diverse links there will be two 4 Gbps Fibre Channel and two 1 Gbps Ethernet
links.

The link may be used in a number of ways during normal steady state operation:

* Real-time replication of storage traffic from the primary site SAN to the secondary site SAN

* Network traffic between the two sites, for example, copying of backups to the secondary site for
restoring onto test systems

The intercampus link also may be used in a number of failure scenarios:

* In the case where the C&W link into the primary site fails, it will be possible to route network
traffic via the secondary site and then over the intercampus link

* In the event of site failover but where storage is still available at the primary site, the link will be
used for replicating SAN traffic in the opposite direction (i.e. from secondary to primary site).

3.3 Storage

The Platform & Storage Architecture (ARC/PPS/ARC/0001) Section 2.5.3.1 describes a number of
service classes for storage, based on performance and resilience requirements. The Storage High Level
Design (DEV/PPS/HLD/0007) provides more detail on the storage implementation.

Storage Service Class 1: Critical applications and databases that need high performance
and replication (RPO = 0, RTO = 0).

Storage Service Class 2: Critical applications and databases that need high performance
and/or replication, but have extended recovery objectives
(RPO = 0, RTO > 24hr).

Storage Service Class 3: Other production databases, Quality Assurance (QA), test -
asynchronous replication or no replication (RPO > 0, RTO > 24hr).

Storage Service Class 4: Near line file system storage - asynchronous replication or no
replication (RPO > 24hr, RTO > 48hr).

Storage Service Class 5: Data with long term, regulated retention. Regulatory reports,
SOX records, email, etc. - may require replication.

Storage Service Class 6: Backup and Restores / Synthetic Full Backups - VTL with
replication capabilities.

Storage Service Class 7: Tape for off-site storage.
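The service classes can be held as a simple lookup table. The sketch below is illustrative only: the type and function names are hypothetical, the summaries are abbreviated, and the RPO and RTO values are copied as text from the list above, with None where the list does not state an objective.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StorageClass:
    number: int
    summary: str
    rpo: Optional[str]  # recovery point objective as stated, None if not stated
    rto: Optional[str]  # recovery time objective as stated, None if not stated


STORAGE_CLASSES = [
    StorageClass(1, "Critical, high performance and replication", "= 0", "= 0"),
    StorageClass(2, "Critical, extended recovery objectives", "= 0", "> 24hr"),
    StorageClass(3, "Other production databases, QA, test", "> 0", "> 24hr"),
    StorageClass(4, "Near line file system storage", "> 24hr", "> 48hr"),
    StorageClass(5, "Long term, regulated retention", None, None),
    StorageClass(6, "Backup and restores, VTL", None, None),
    StorageClass(7, "Tape for off-site storage", None, None),
]


def zero_data_loss_classes() -> List[int]:
    """Return the class numbers whose stated RPO is zero."""
    return [c.number for c in STORAGE_CLASSES if c.rpo == "= 0"]
```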

These have been mapped physically onto a number of different systems, which happen to all be provided
by EMC except for Class 7, although other vendors provide similar functionality.

EMC were selected in a rigorous bidding process as the best overall supplier. This judged the ability of
the vendor to support the service long term as well as meeting the requirement, and was not simply a
lowest cost bid, although the alternative bids were used to secure a competitive price from EMC. There
are advantages from a support resolution perspective of having a single supplier provide all components
in a SAN.

More detailed information can be found on EMC's website for each of these systems. This section is only
intended to give a very high-level overview.

DMX (Direct Matrix) Disk Array is the latest generation of the Symmetrix range (Symm7). The Symmetrix
provides an extremely robust and resilient storage array, fronted by a large cache for performance, and
capable of synchronous or asynchronous hardware replication, known by EMC as Symmetrix Remote Data
Facility (SRDF), over considerable distances, although large distances do introduce latency on disk
writes. The disk array provides RAID-1 and RAID-5 protection internally, and using the TimeFinder
product enables clone or split-mirror backups, known by EMC as Business Continuance Volumes (BCV),
to be managed by the array, and avoid the need for host CPU cycles to be dedicated to this task. The
Symmetrix has an internal battery backup, and even in the event of a complete site power failure (e.g. the
fire brigade shut off the power) the data will be de-staged from the cache to disk before automatically
shutting down.

Clariion Disk Array (formerly made by Data General) is a slightly less resilient disk array, which is still
capable of high performance, but not on the same scale as DMX. Limitations on the number of clones
and site replicated volumes that Clariion can manage mean that Clariion cannot carry all Class 2 & 3
storage.

Celerra Network Addressable Storage (NAS). A small appliance connects to both SAN and LAN
and allows filesystems to be presented in a heterogeneous way either to NFS based systems such as
Linux and Solaris or to CIFS based systems such as Windows 2003. The use of such a facility means that
the storage may be presented directly to many systems which need to share data, and the
interdependence of those systems is broken, allowing lower resilience targets or longer backup windows to be
specified.

Centera Content Addressable Storage (CAS). This is used by EMC to provide a low cost and more
reliable alternative to tape for audit and archive solutions. The resilience of each array is relatively low,
but by having one at each site and allowing the audit application to manage replication a solution with
high overall resilience may be built. The IXOS archive solution deployed for POLFS actually allows users
to query data in the archive in practically real-time, which would be impossible with a tape or optical
jukebox system.

EDL 4100. Formerly CDL (Clariion Disk Library) but renamed EMC Disk Library as the latest generation
allows a DMX to be used for greater scalability. A small Linux server is attached to the disk array, and
presents virtual tape drives to the SAN to emulate a variety of physical tape libraries. In the HNG-x
solution we have chosen to emulate a StorageTEK L700 with LTO3 drives, as this both allows a simple
migration path from Horizon which used a real StorageTEK L180 with LTO1 drives, and allows test
systems to use StorageTEK libraries such as the L40 without modifying the backup application.

TX24 LTO3 Autoloader. A small and very simple tape library with a single LTO3 drive, but capable of
writing to as many as 24 tapes without operator intervention. This is provided only to allow very rare
physical exports of data from the Data Centre. Normally the size of the EDL and Centera
mean that all data is stored within the solution, using the pair of sites for resilience.

The primary storage elements in each Data Centre consist of:
* Two EMC Symmetrix DMX3
* One EMC Clariion CX3-80
* One EMC Celerra NAS
* Two EMC Centera CAS systems (one for Audit, one for POLFS Archive)
* One EMC EDL 4100 Virtual Tape Library
* Two Cisco MDS9509 Directors [configured as independent switched fabrics]
* One Fujitsu-Siemens FibreCat TX24 LTO3 autoloader.

For systems which are hosted on BladeFrame, all data including the boot drive will be hosted on the SAN.

IRE++1_and this- does not form_part of the reduced testing capability available following loss of IRE49.

All significant business data from discrete servers is also stored on the SAN (these servers use local boot
drives).

3.3.1 Storage Design

The storage design for this project is detailed in DES/PPS/HLD/0007 and HNG-X SAN High Level
Design (DES/NET/HLD/0007). A brief overview is provided here to aid the reader's understanding.

[ SHAPE \* MERGEFORMAT ]

Different VSANs are used to segregate data paths either for performance or security reasons. The
primary separations have been labelled here as VSAN A etc.

Storage is synchronously replicated between the primary and secondary sites using SRDF via the
intercampus link described earlier. Before a write is acknowledged to a server, it is written to the storage
at both the primary and secondary sites. This ensures consistency of data between the primary and
secondary sites.
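The acknowledgement rule can be illustrated with a toy model. This is a sketch of the principle only and is not SRDF itself: the class names are hypothetical, storage is modelled as an in-memory dictionary, and the behaviour when a site is unavailable (refusing the write) is an assumption made to keep the example small.

```python
class SiteUnavailable(Exception):
    """Raised when a site cannot accept a write."""


class SiteStore:
    """In-memory stand-in for the storage at one site."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.available = True


class SynchronousMirror:
    """Acknowledge a write only once both sites hold the data."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, block_id, data):
        # Assumption: refuse the write outright if either site is down,
        # so that the two copies can never diverge in this model.
        for site in (self.primary, self.secondary):
            if not site.available:
                raise SiteUnavailable(site.name)
        self.primary.blocks[block_id] = data
        self.secondary.blocks[block_id] = data
        return True  # the acknowledgement returned to the server
```

Because the acknowledgement is returned only after both copies are written, every acknowledged block is present at both sites in this model.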

The Symmetrix DMX disk arrays are operated as two independent pairs. One will contain the Branch
Database, and the other the Branch Standby DataGuard copy. In the unlikely event that a fault develops
with the storage, continuity of service is ensured, as at least one copy of the Branch Database will be
available. Note that this also requires the BAL servers and the BranchDB RAC Node boot disks to be
balanced across both DMXs. This will be discussed in more detail in the later chapter for that service.

The CX3-80 Clariion also supports an area of disk for Oracle RMAN backup. This is the primary recovery
point for Branch Database corruption, and online storage is provided to allow recovery to be made in as
timely a manner as possible.

[DN: This has not been adopted for NPS, so we would lose banking while the problem was resolved if
that DMX happened to be the one with a fault.]

3.4 BladeFrame

The Platform & Storage Architecture (ARC/PPS/ARC/0001) and the BladeFrame High Level Design
(DES/PPS/HLD/0025) discuss BladeFrame deployment in detail. A summary is provided here.

BladeFrame from Fujitsu-Siemens consists of a chassis with up to 24 stateless processing blades
(pBlades), two control blades (cBlades) and two switch blades (sBlades) in a cabinet with a footprint similar
to a normal server cabinet.

* A processor blade or pBlade contains processors and memory. There are 24 per chassis (or
frame).

* A switch blade or sBlade manages the internal switching of fibre-channel (storage) and Ethernet
packets and provides I/O to the pBlades via the backplane.

* The control blades or cBlades comprise a cluster presenting the PAN Manager service. This
provides out-of-band management of the server instances (power on, power off, console), and
also controls all the external connections to the network and storage. A single view of resilient
network and storage connections is provided to the pBlades.

* From PAN Manager 5.1.3 Xen virtualisation is offered natively by BladeFrame, and a hypervisor
may be started on a pBlade to provide a number of virtual blades or vBlades. The amount of
memory used by all the vBlades cannot exceed the physical memory in a pBlade, and this
controls the amount of virtualisation that may be offered. Provided that the virtual CPU threads
do not exceed the number of physical CPU cores little performance degradation is noticed.

* An LPAN is a named set of resources. In the HNG-x implementation this has been chosen
to correspond to the Platform Set (also known as Rig). The disks in the SAN have a static
relationship with the LPAN, but other resources such as pBlades may be shared allowing a form
of work load management.
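The two vBlade sizing rules described above (a hard limit on memory, and a softer guideline on CPU threads) can be checked with simple arithmetic. The sketch below is illustrative: the names and the example sizes are hypothetical and are not taken from PAN Manager.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VBlade:
    name: str
    memory_gb: int
    vcpus: int


def check_vblade_plan(pblade_memory_gb: int, pblade_cores: int,
                      vblades: List[VBlade]) -> Dict[str, bool]:
    """Check a proposed set of vBlades against one pBlade."""
    total_memory = sum(v.memory_gb for v in vblades)
    total_vcpus = sum(v.vcpus for v in vblades)
    return {
        # Hard rule: vBlade memory cannot exceed the pBlade's physical memory.
        "fits_in_memory": total_memory <= pblade_memory_gb,
        # Guideline: performance may degrade once vCPUs exceed physical cores.
        "cpu_oversubscribed": total_vcpus > pblade_cores,
    }
```

For example, two vBlades of 8 GB and 2 vCPUs each fit on a hypothetical 16 GB, 4 core pBlade without oversubscribing the CPU; adding a third would break the memory rule.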

The cBlade runs the PAN Manager software. This software allows logical sub-sets of resources (LPAN)
to be defined and virtualised server instances known as pServers to be associated with resources. This
permits one to specify a set of services in a virtual way, and therefore have an area of storage (which we
know how to replicate using hardware) as the DR mechanism.

Since all the pBlades are stateless, all data including boot disks is stored on the EMC arrays. All disks
except disks holding swap or temporary information will be replicated to the secondary site.

The PAN Manager service is hosted on the master cBlade, and the cBlades operate as a cluster to
provide a resilient PAN Manager on a VIP.

If a pBlade were to fail, the PAN Manager software can automatically start up the pServer on a spare
pBlade in the server's primary pool, or on a pBlade in a designated failover pool with minimal loss of
service (the time for a reboot). It is up to the application to recover. Although tools do exist to allow
application management by PAN Manager, these have not been used at HNG-x.
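The order in which a replacement pBlade is sought can be sketched as follows. This is a simplified illustration and not PAN Manager's actual logic: the function name and the representation of pools as lists of blade names are hypothetical.

```python
from typing import Iterable, Optional, Set


def choose_restart_blade(primary_pool: Iterable[str],
                         failover_pool: Iterable[str],
                         busy: Set[str]) -> Optional[str]:
    """Pick a spare pBlade: the primary pool first, then the failover pool.

    `busy` holds blades that are already in use or have failed.
    Returns None when no spare blade is available.
    """
    for pool in (primary_pool, failover_pool):
        for blade in pool:
            if blade not in busy:
                return blade
    return None
```

The pServer is then rebooted on the chosen blade; recovery of the application itself remains the application's responsibility, as stated above.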


During normal use, the Control Blades share external traffic for the pBlades in a balanced state, allowing
the BladeFrame to harness the capacity of both simultaneously. If an individual cBlade should fail, the
other continues to manage resources for the entire BladeFrame.

A pair of Switch Blades (sBlades) is provided to route traffic from the pBlades to the cBlades. At system
startup or pBlade insertion each pBlade makes a data connection to both sBlades, which in turn make a
connection to both cBlades. Egenera uses the term GigaNet for this service, which carries both LAN
and SAN traffic for the pServers on the pBlade. In service these paths are balanced in a least-busy
fashion. Upon failure of either an sBlade or a cBlade the pServer simply retries and uses one of the
remaining paths.

In principle when an sBlade or cBlade has failed the number of data paths, and therefore the throughput,
is halved, but in practice it is rare for all pServers to be operating at maximum capacity. A single cBlade
(LAN and SAN) can pass 16 Gbps of traffic, which is a huge amount, and other infrastructure constraints
are likely to be of more concern.
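The path arithmetic can be made concrete. With two sBlades each connected to two cBlades there are four paths from a pBlade, and the loss of any one sBlade or cBlade leaves two. The sketch below is a simplified model: the component names and the load table are hypothetical, and least-busy selection is represented by a plain minimum over a load counter.

```python
from itertools import product

SBLADES = ("sBlade1", "sBlade2")
CBLADES = ("cBlade1", "cBlade2")


def usable_paths(failed=frozenset()):
    """List the (sBlade, cBlade) paths that avoid every failed component."""
    return [
        (s, c)
        for s, c in product(SBLADES, CBLADES)
        if s not in failed and c not in failed
    ]


def pick_path(load, failed=frozenset()):
    """Choose the least busy usable path; `load` maps a path to its load."""
    paths = usable_paths(failed)
    if not paths:
        raise RuntimeError("no data path available")
    return min(paths, key=lambda path: load.get(path, 0))
```

A retry after a failure is then simply another call to `pick_path` with the failed component added to `failed`.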

Due to the exceptional self-management of this infrastructure component and the abstraction of the physical
processors from the storage which holds the boot images and data, BladeFrames are exceptionally
well-suited for use in a site-failover scenario. It is expected that all services will be hosted on BladeFrame
technology except where some feature explicitly inhibits this.

There is an additional benefit that the number of "spare" boxes sitting about goes down from one per
service to one per ten services reducing the overall running cost of the solution.

BladeFrames may operate in “farms” of up to three chassis, allowing a single point of control, and also
allowing services to fail over between frames. One PAN Manager Service is designated as the Frame
Master, and all farm members operate as an extended cluster.

Early experience has shown that Farm Manager failover is not as well behaved as PAN Manager failover,
and that if the chassis with the Farm Manager fails control of the farm is lost until it is restored. This may
be rectified in newer versions of PAN Manager.

This experience, plus physical limitations on the number of LUNs that may be presented to each PAN
Manager (2000), has ruled out the use of farms for HNG-x. The loss of the Farm functionality is
not very important to HNG-x, but it does introduce a loss of flexibility. Services may still be moved
between frames, but this is a manual process that involves storage reconfiguration.

Highly available services (like BranchDB) still require special solutions (like Oracle RAC) which
BladeFrame is particularly well suited to host because of the way storage is virtualised. If a cluster is
deployed it is important to make sure that the pBlades used for the cluster are in different power domains
so that the failure of a power module will only affect one cluster member. This approach is applied more
generally to other physical resources, but it must be recognised that there are so many services in an
individual frame that complete failure of an entire frame will be likely to trigger DR.

One feature of this virtualisation of storage is that the pBlade is not SAN aware, and certain agents and
tools from storage vendors will not be able to resolve or manage storage presented to these servers. This
may change in future releases.

The BladeFrame chassis are deployed in pairs, one at each site, with the same physical arrangement of
pBlades in corresponding pairs. Whilst this is not strictly necessary it makes assurance of performance
upon DR more straightforward. Each member of a pair will have the same LPAN definitions, typically a
Production LPAN and one or more Test LPANs. Services within these LPANs may be stopped and started
as required provided external resources such as storage and networks are being presented to the LPAN
at that site.


The Power Modules each have two 3-phase supplies and supply power to 6 chassis slots for Blades.
They are labelled A, B, C and D. A also supplies cBlade 1 and switchBlade 1, and B supplies cBlade 2 and
switchBlade 2.

If the PIM partially fails (one line in) but still works on the second line, a PIM replacement must still be
scheduled, and that means taking 6 pBlades and possibly 1 control blade and 1 switch blade out
of action while the PIM is replaced.

If the PIM fails, causing a total loss of power on a power domain, it will result in 6 pBlade crashes (in a
fully populated frame) and a single control blade and switch blade failure if it is in the A or B power
domains.

While a PIM is off-line there is not enough spare resource to restart all the failed services on the
remaining pBlades, so services have been distributed in each BladeFrame and across BladeFrames
such that PIM failure causes loss of resilience, not loss of service (see DES/GEN/SPE/0007 for detailed
layouts).
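The effect of losing a power domain, and the rule that cluster members should sit in different domains, can be sketched as below. The mapping of slot numbers to power domains is an assumption made purely for illustration (slots 1 to 6 on A, 7 to 12 on B, and so on); the actual layouts are in the referenced specification, and all names here are hypothetical.

```python
DOMAINS = "ABCD"
SLOTS_PER_DOMAIN = 6
# Control and switch blades fed by each domain, as described above.
EXTRA_BLADES = {"A": ["cBlade1", "sBlade1"], "B": ["cBlade2", "sBlade2"]}


def domain_of_slot(slot: int) -> str:
    """Map a pBlade slot (1..24) to a power domain (assumed contiguous)."""
    if not 1 <= slot <= len(DOMAINS) * SLOTS_PER_DOMAIN:
        raise ValueError(f"slot out of range: {slot}")
    return DOMAINS[(slot - 1) // SLOTS_PER_DOMAIN]


def lost_on_domain_failure(domain: str) -> dict:
    """List what goes down when one power domain loses all power."""
    slots = [s for s in range(1, 25) if domain_of_slot(s) == domain]
    return {"pblade_slots": slots, "other": EXTRA_BLADES.get(domain, [])}


def members_in_distinct_domains(member_slots) -> bool:
    """True if no two cluster members share a power domain."""
    domains = [domain_of_slot(s) for s in member_slots]
    return len(set(domains)) == len(domains)
```

Under this assumed layout a two node cluster in slots 1 and 7 would lose only one member to a single PIM failure, whereas slots 1 and 2 would lose both.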

Loss of an entire chassis is unlikely. Depending on which chassis is lost an assessment would need to be
made of whether to initiate DR immediately or whether enough service was being offered to delay the DR
until outside the normal operational day.

3.5 Network

The network approach to resilience is defined in Network Technical Architecture (ARC/NET/ARC/0001)
and developed for each major subsystem in high level designs: the Data Centre LAN Design
(DES/NET/HLD/0004), the Wide Area Network Design (DES/NET/HLD/0009), the Branch Access HLD
(DES/NET/HLD/0014), and the Transit LAN Design (DES/NET/HLD/0015) which presents models for
connecting to third parties such as the financial institutions.

The network is based on four core Cisco 6513 switches (two per site) and four Access 6513 switches
with resilient ASA5540 based firewalls between the Core and Access layers. Each switch is internally
highly redundant, and is connected via ISL to its partner. Each core switch will also have a firewall
module and an ACE (Application Control Engine) module.

3.5.1 Cisco VRRP

The router uses Cisco VRRP to provide a virtualised routing service. If one of the 6513 switches fails,
then the other switch will take over as being the next hop gateway for each VLAN that it supports.

3.5.2 Cisco ACE (Application Control Engine)
The Cisco ACE module is a load balancing module which fits within the Cisco 6513 chassis.

The module monitors services on the network that are in an N+1 configuration. If a client uses one of
these services then ACE will be used to direct the client to the least loaded instance, or in the case of
active/standby systems to the active instance.

Configurable probes allow the ACE module to check that an application is available rather than just that the server
is available, so in addition to simple “ping” monitoring, scripts can be configured to check the health of a
particular web application hosted by a server, and remove the server from the load-balancing pool if it
returns an incorrect response. The application designer is expected to discuss how ACE is used and the
detection strategy in the network section of the application HLD.
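The selection behaviour described in this section can be sketched in a few lines. This is an illustration of the principle only and does not represent ACE configuration or its API: the names are hypothetical, "load" is modelled as a connection count, and each probe is any callable that returns True when the application responds correctly.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Instance:
    name: str
    connections: int            # current load on this instance
    probe: Callable[[], bool]   # application-level health check
    active: bool = True         # False for the standby of an active/standby pair


def pick_instance(pool: List[Instance]) -> Optional[Instance]:
    """Direct a client to the least loaded instance that is active and healthy."""
    candidates = [i for i in pool if i.active and i.probe()]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i.connections)
```

An instance whose probe fails is simply left out of the candidate list, which corresponds to removing the server from the load-balancing pool.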

Inter-chassis: An ACE in one Cisco Catalyst 6513 is protected by an ACE in a peer Cisco Catalyst 6513.
Examples of systems monitored by ACE:


Branch access layer (Interstage)
Other web services
POLFS Production instance (PLP)

The list is not exhaustive, but there is a service-by-service overview later in the document that will
describe whether a system uses ACE, and if so whether it uses ACE load balancing or some internal load
balancing.

There are other mechanisms for load balancing, for example using the Oracle TNS Listener service, and
it is up to the application designer to choose the most appropriate technology in terms of performance
and resilience. Choosing ACE at least means that the application designer can hand off the operation of
resilience to the network but the time taken to detect and advertise may not be suitable in all cases.

3.5.3 Network Management Systems

HP OpenView, Cisco Works and a number of diagnostic probes are located in each Data
Centre. The network is treated for management purposes as entirely Production, with isolated areas of
testing. This is to ensure that the secondary site network is always ready to act as the DR target.

HP OpenView gathers SNMP events from network equipment, filters the events, and forwards them
to Tivoli (SYSMAN). OpenView also actively probes for managed devices, such as servers, and raises an
alert if the server cannot be contacted.

Alarm Point is used to forward pager alerts for critical failures to ensure that a "flashing light" is less likely
to be overlooked. This is especially important when a critical alert is received during a period when a
number of less severe alerts are being dealt with. Alarm Point will also be used by the SYSMAN systems
to raise alerts.

Loss of these systems is analogous to losing the instruments in a car. It does not put the system into
immediate jeopardy, but there is a danger that a warning will be missed. If an event has occurred, then
the unavailability of these systems means that diagnosis will be slowed down considerably.

The systems, collectively known as NMS, are described in the Network Management System HLD
(DES/NET/HLD/0012), and their migration is covered in the migration design (DES/MIG/HLD/0001).

3.5.4 Directory Services

Directory services are provided by Windows Active Directory. AD will be deployed in accordance with
best practices for security and resilience to provide a continuous service in the event of disaster. This is
described in the Active Directory High Level Design (DES/PPS/HLD/0003). Interface modules are
provided to allow Solaris and Linux systems to interact with AD.

Secondary authentication is integrated with AD and managed transparently for applications. This is
described in Strong Authentication High Level Design (DES/SEC/HLD/0001).

Although at first sight these are not very critical services, their non-availability may prevent staff from
accessing the estate to effect a timely repair. It is therefore essential that, under all failure circumstances
where it is possible to offer a business service, a working authentication mechanism is available.

AD generally operates as a peer-to-peer service, but has a number of "Flexible Single Master Operations"
or FSMO Roles. These roles are not all offered by the same server, and tend to be spread out amongst
the members. The roles are only required to make changes to the domain, but such changes include
adding members and password resets, so any aspect of DR that requires such an operation may need to
wait for AD FSMO Role transfer to complete, and no critical services should depend on this.

3.5.5 DNS

A primary and secondary DNS service is provided based on dedicated Linux servers. DNS is designed for
resilience across sites, and therefore the resilience model is also the DR model. The Windows Active
Directory domain controller infrastructure is also a secondary DNS server, and provides DNS services to
the Windows platforms in the estate. The design of DNS is covered in the Domain Naming System High
Level Design (DES/NET/HLD/0006).

As with AD there is a "master" for making updates, and a number of slaves offering the look-up service.
Critical services should not rely on updating DNS entries as part of the DR process as the master
may not be available at the time of failover.

3.5.6 Time Synchronisation

There will be four dedicated time servers, two at each Data Centre. Like AD and DNS, time
synchronisation is designed to be inherently resilient and the terms resilience and DR are synonymous.
The design of time synchronisation is covered in the Time Synchronisation High Level Design
(DES/NET/HLD/0013).

3.6 Oracle

3.6.1 Oracle RAC
The Branch Database and the NPS database both use Oracle RAC to provide high availability.

This configuration provides load balancing of client requests during normal operation. Additionally, if one
node in the RAC cluster fails, the other nodes will take over the load.

Therefore, the system capacity is managed in an N+1 configuration, such that the RAC cluster can
handle peak load even with one failed node.

Branch Database N=3 (four nodes normally active)
NPS N=1 (two nodes normally active).
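
The N+1 sizing rule can be checked with simple arithmetic: with N+1 nodes normally active, each of the N survivors must be able to carry 1/N of the peak load. The following Python sketch shows the calculation. The load and capacity figures are hypothetical, in arbitrary units, and are not taken from any capacity document.

```python
def load_per_surviving_node(peak_load: float, nodes_active: int, nodes_failed: int = 1) -> float:
    """Peak load each surviving node must carry after `nodes_failed` node failures."""
    surviving = nodes_active - nodes_failed
    if surviving <= 0:
        raise ValueError("no surviving nodes to carry the load")
    return peak_load / surviving


def meets_n_plus_1(peak_load: float, nodes_active: int, node_capacity: float) -> bool:
    """True if the cluster can carry peak load with one node failed."""
    return load_per_surviving_node(peak_load, nodes_active) <= node_capacity


# Hypothetical figures only.
print(meets_n_plus_1(peak_load=900, nodes_active=4, node_capacity=300))  # Branch Database shape, N=3
print(meets_n_plus_1(peak_load=900, nodes_active=4, node_capacity=250))  # same cluster, undersized nodes
print(meets_n_plus_1(peak_load=100, nodes_active=2, node_capacity=100))  # NPS shape, N=1
```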

For Branch DB each branch will normally access the same node. If a node is unavailable then the failed
node's branches will be spread across the remaining nodes.
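
As an illustration only, the behaviour "same node normally, spread across survivors on failure" could be modelled as below. The real mechanism is defined in the Branch Database HLD; the modulo mapping, the node names and the function name here are assumptions made for this sketch and are not the documented design.

```python
def node_for_branch(branch_id: int, nodes: list[str], failed: set[str]) -> str:
    """Map a branch to its usual node; if that node has failed, spread over the survivors."""
    home = nodes[branch_id % len(nodes)]  # assumed mapping, not the documented one
    if home not in failed:
        return home
    survivors = [n for n in nodes if n not in failed]
    if not survivors:
        raise RuntimeError("no nodes available")
    return survivors[branch_id % len(survivors)]


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical node names
for branch in (5, 9, 13):
    print(branch, node_for_branch(branch, nodes, failed=set()),
          node_for_branch(branch, nodes, failed={"node2"}))
```

In this model branches 5, 9 and 13 all normally use node2, and when node2 fails they land on three different surviving nodes, while branches homed on other nodes are not moved.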

NPS supports several services e.g. ETS, DCS, NBS as if they were separate applications. A branch
connection will go to the same node for any one of these services, but the branch will not necessarily use
the same node for all services.

The mechanism for ensuring persistent connections and load balancing and for managing failed nodes is
described in detail in the Branch Database HLD (APP/ARC/HLD/0020), with high level context in the
Online Services Architecture (ARC/APP/ARC/0005) and the Branch Database Architecture
(ARC/APP/ARC/0008).

3.6.2 DataGuard

Oracle DataGuard is an Oracle feature which allows one or more standby databases to be maintained in
a transactionally consistent fashion to a primary database. This is achieved by applying changes from the
primary database to a secondary copy of the database.

In the implementation for HNG-X, the production Branch Database will be the primary database,
and a secondary copy of this database will be maintained at the same site. The only purpose of this
secondary database is to guard against physical corruption of the primary production database, i.e. some
form of unrecoverable I/O error resulting in a loss of database integrity.

DataGuard allows for control of the lag target in a number of ways, including fully synchronous. Because
the changes are propagated as application level applied changes it is very unlikely that a data corruption
will be propagated, and in the event of the primary being unavailable due to a corrupt block, failover to
the standby is extremely fast. The primary can then be repaired.

Whether the database is operated flip-flop or by failing back is a design choice.

3.6.3 Streams

Oracle Streams is used to propagate data from a source database, often a high rate of change
transaction processing system, to a target database with a different structure, often a reporting or data
warehouse type system. This allows large queries to be run in near real-time on the target database
without impacting the performance of the source database.

In HNG-x the Branch Database will use Streams to send data to the Branch Support database.
This allows the SSC to perform diagnostic queries without connecting to the Branch Database, and also
allows batch SLA reporting to avoid the Branch Database.

The consequence for resilience and DR is that a DataGuard failover to the Branch Standby must
continue to replicate via Streams to the Branch Support database.

3.7 Site Consistency

To enable successful site failover to occur, both sites need to be reasonably consistent. That is to say,
the amount of data loss due to a site failover needs to be minimised. The major area where this may
occur is in the SAN replication.

Processes need to be in place to ensure that any change that happens on the primary site also occurs on
the secondary site. The “lag” must be carefully monitored, and alerts should be raised in case the lag
grows beyond an acceptable threshold. This is generally an alert which is raised by the storage system
itself.

At the time of writing no applications have elected to operate asynchronous storage replication.
Replication is either synchronous using the EMC SRDF or MirrorView storage replication mechanisms, or
is managed by the application.

3.7.1 Storage

A number of classes of storage are proposed in the Platform & Storage Architecture
(ARC/PPS/ARC/0001). Applications which require zero data loss on failover must specify suitable
storage, which in this case is EMC DMX with synchronous SRDF enabled. Applications which can
recover following a failover where some amount of data has been lost, either because the loss of the data
is not significant or because the recovery may be effected from journals or upstream systems, may use a
lower storage class.

Storage presentation is very important. LUNs that are replicated between sites should have the same
SCSI IDs so that the BladeFrame at the failover site can recognise the storage without need for
reconfiguration. Additionally device IDs within the storage array should correspond between sites for
replicated devices. There are a number of mechanisms for coping with asymmetry, but they all require
considerable development and testing effort, and are confusing for support staff. To achieve the goal of
operating an efficient and cost effective service such asymmetries are deprecated.

In addition, to allow the workload of any pBlade to be run from any BladeFrame, each LUN that is
required by any pServer will be presented to the pBlades in every BladeFrame. There is a limit to the
maximum number of LUNs that may be presented to an individual frame at any one time, and this will be
controlled by LUN Masking.

The process of changing LUN Masking is somewhat tedious and error-prone. It will therefore be scripted.
As far as possible DR will not require changes to LUN Masking. If this becomes unavoidable it must be
recognised that the process of changing the masking is time consuming and subsequent steps in the DR
must not be allowed to proceed until the masking has been confirmed as having been changed
successfully.
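
The gating requirement above, that later DR steps wait for confirmed masking, can be sketched as follows. This is an outline under stated assumptions, not the DR script itself: change_masking, masking_confirmed and the later steps are hypothetical callables supplied by the caller, and a real script would pause between confirmation checks rather than polling back to back.

```python
from typing import Callable


def run_dr_steps(change_masking: Callable[[], None],
                 masking_confirmed: Callable[[], bool],
                 later_steps: list[Callable[[], None]],
                 max_checks: int = 5) -> bool:
    """Change LUN masking, then run later DR steps only once the change is confirmed."""
    change_masking()
    for _ in range(max_checks):
        if masking_confirmed():
            break
    else:
        # Masking never confirmed: stop here so no later step runs against stale storage.
        return False
    for step in later_steps:
        step()
    return True


log: list[str] = []
checks = iter([False, False, True])  # confirmation succeeds on the third check
ok = run_dr_steps(lambda: log.append("change masking"),
                  lambda: next(checks),
                  [lambda: log.append("start pServers")])
print(ok, log)
```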

3.7.2 Network

All VLANs that are required by any pServer will be trunked to all of the BladeFrames. Internally the
pServer instances connect to vSwitches. The vSwitches will be associated with a VLAN ID and named
<rig>pbfsNNNNvSwitch_XXX where NNNN is the VLAN ID and rig is the two letter identifier, e.g. pr for
production.

The presentation of vSwitches to LPANs prevents test systems and production systems from accidentally
sharing the same VLAN, and the naming convention assists operators in identifying production switches.

The association of VLAN ID and Security Domain is enumerated on a rig by rig basis in the Data Centre
LAN DesignHNG-x VLAN Mappings (DEV/INF/IONLLD/000244).

There is a distinction to be made between server instances which are BladeFrame based and have a
production/test mode of failover, and those server instances which are active/active or active/standby,
which may include both BladeFrame and discrete servers.

Server instances which use the production/test mode of failover will appear on the secondary site with the
same IP as they had on the primary site. These servers will use 172.18.0.0 subnets. In order to prevent
confusion these subnets will normally be inactive at the secondary site. During DR they will be disabled at
the primary site and enabled at the secondary site. To simplify this process Cisco Works scripted tasks
will be made available for Network Operations staff to use.

Note that although POLFS internally considers itself Production/Test, from the point of view of service
delivery (See Section 10.1.8) it is treated as active/active, as the "Test" systems are used by external
customers. It is thus possible for the POLFS Production service to fail over without any impact on the rest
of the Data Centre. As now, network load balancing (CSM in Horizon, ACE in HNG-X) will advertise a VIP
to the external customer for the POLFS service.


As far as possible one pair of frames will be dedicated to hosting active/active services such as AD, DNS,
and SAS which are required to support the initiation of disaster recovery.

Server instances which use the active/active or active/standby model must be present at both sites
simultaneously, and will use 172.16.0.0 subnets for IRE11 and 172.17.0.0 subnets for IRE19. Services
on these servers are likely to require ACE to present a service VIP to clients.

A Security Domain which contains server instances of both types will require a minimum of four VLAN
IDs, two per site.
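
The addressing scheme in this section can be expressed as a small lookup using Python's standard ipaddress module. The text gives only the network addresses, so the /16 prefix lengths below are an assumption made for illustration, as are the function name and the sample addresses.

```python
import ipaddress

# Network addresses are from the text above; the /16 prefix lengths are assumed.
SUBNET_ROLES = {
    ipaddress.ip_network("172.16.0.0/16"): "active/active or active/standby, IRE11",
    ipaddress.ip_network("172.17.0.0/16"): "active/active or active/standby, IRE19",
    ipaddress.ip_network("172.18.0.0/16"): "production/test failover, same IP at either site",
}


def failover_model(address: str) -> str:
    """Return the failover model implied by the subnet an address belongs to."""
    ip = ipaddress.ip_address(address)
    for network, role in SUBNET_ROLES.items():
        if ip in network:
            return role
    return "unknown"


for sample in ("172.16.4.10", "172.17.4.10", "172.18.4.10", "10.0.0.1"):
    print(sample, "->", failover_model(sample))
```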

4  Dev/Test

In order to provide limited testing capability in the event of a Data Centre DR, it is planned to take copies
of the LST test rig (from IRE19) and store these in IRE11. The same DR approach can be used
irrespective of whether IRE11 or IRE19 fails. Apart from ensuring that the LST data is available, no
additional design work is needed in advance.

In the event of a DR event, once the production system has been established, an LST configuration will
need to be determined sufficient to test specific fixes; this configuration may be able to run within the
spare production capacity. In some cases it may be necessary to reduce production capacity in order to
accommodate LST testing, for example outside normal PO hours when online systems can safely be run
with reduced resilience.

In order to define limitation of scope two sections from Schedule B3.3 are reproduced here for
convenience.

B3.3 Section 12.6    The HNG-x Services shall use the facilities of the DR Data Centre to
provide a testing environment, which shall be able to concurrently
support as a minimum either:

(a) two small testing configurations for performance and volume
testing (which may support up to 50 per cent of the business
volumes as set out in the CCD entitled "Horizon Capacity
Management and Business Volumes" (PA/PER/033)) and
resilience testing of platforms and applications; and

two small testing configurations for functional testing,

single testing configuration only to support performance and
volume testing (which may support up to 100 per cent. of the
business volumes as set out in the CCD entitled "Horizon
Capacity Management and Business Volumes" (PA/PER/033))
and resilience testing of network components.

Any additional test configurations that are required to support changes to
Post Office's business shall be dealt with through Change Control
Procedure.

B3.3 Section 3.3.1 (q)    In the event that the DR Data Centre needs to be used to run the live
service or if the DR Data Centre itself is unavailable, there will be no
significant test environment available. In this scenario, limited testing
(sufficient to test minor fixes needed to keep the live service operational)
will be available at a Fujitsu Services development site. However such
testing facilities will not be sufficient to test releases.

In normal use the servers located on the DR Data Centre will be used for software testing.
In the event of having to perform site failover, testing would cease and the servers will be re-configured
for live running.

In order to allow best use of the physical resources available the Platform & Storage Architecture
(DES/PPS/ARC/0001) mandates that all pServer instances will be hosted on vBlades except where some
specific requirement, e.g. use of 64 bit operating system, makes this impossible. This allows Test to
overload pServers onto pBlades with the same configuration and build as live except for the amount of
memory & CPU offered.

Horizon platforms use NT4SP6A which it is not possible to run on many modern systems owing to lack of
suitable hardware drivers. NT4SP6A systems will either be run on like-for-like hardware sourced from
equipment brokers, or where possible hosted on Microsoft Virtual Server Host in the BladeFrame. As
VSH instances already offer virtualisation these will be run on pBlades rather than vBlades.

It is possible to present both Xen and MS-VSH virtualisation external to the BladeFrame, for example on
FSC Primergy RX300 type servers. Provided no SAN storage is required this is relatively uncomplicated,
but these systems are not as straightforward to manage.

The BladeFrames, console management systems, network & SAN switches, routers and storage arrays
at the secondary site are all considered live production equipment and any management interfaces are
connected to production networks not test ones. This is to ensure that readiness to fail over is never
compromised.

Some components of test systems will exist at IRE11 in order to test cross-campus features, as shown in
ARC/SOL/ARC/0001 Section 6.2. In order to reduce the cost of maintaining representative development
environments these systems may also be used for production support to develop fixes, for example for
backup and DR. This will be under the control and management of the LST Test Manager.

Test services (known as "rigs") may be shut down when not in use. The only requirement is that they be
safe during a period of DR or DR testing, and that they can be restarted when required. This will be
managed by controlled shutdown of test services during any DR, managed by the business continuity
plan. The start-up is exactly equivalent to a failback of the production service.

In order to meet the requirement of Schedule B3.3 Section 3.3.1 (q) the data that provides the main pre-
production rig, known as Live System Test or LST, will be periodically copied to IRE11 storage under the
direction of the LST Test Manager. This will use a backup system already being used to preserve "start of
cycle" images for the System Test Rig. In the event of running from a single Data Centre there are
sufficient pBlades defined in the Production service in DES/GEN/SPE/0007 v4.4 to enable an appropriate
LST LPAN to be defined. If necessary testing will be performed at a time when more resources are
available, e.g. outside core hours.

A CP has been raised to enable the alternate site for LST Testers to operate concurrently with BRAOt.
Any failover of LST will be transparent to the site the testers are located at.

In order to minimise on-going support costs the operation of LST in this way is considered an emergency
process and will not normally be exercised. Any limitations to operating LST in this way will be overcome
operationally in the event of having to invoke the process, but in fact the process is similar to the
procedure for moving services between BladeFrames, and the risk is not considered to be great.

5 Server classes

There are different classes of server. These are described in the Platforms & Storage Architecture
(ARC/PPS/ARC/0001).

- BladeFrame, which is SAN-attached

- Discrete Windows 2003 R2 servers on Fujitsu-Siemens Primergy platforms (principally RX300)

- Discrete Linux RHEL 4.0 servers on Fujitsu-Siemens Primergy platforms (principally RX300)

- Discrete Solaris 9 Servers on various Fujitsu-Siemens PrimePower platforms (principally PW450)

- Discrete Solaris 10 Servers on various platforms (principally Fujitsu-Siemens PrimePower
  PW250, PW650 or Sun SunFire V125)

All of these servers are "Data Centre class" systems and offer substantial internal resilience
such as mirrored boot disks, multiple network interfaces and N+1 power supplies.

Where application resilience requirements permit they may be deployed in a less resilient mode, for
example with a single boot disk, but in practice the savings achieved in deployment are usually lost the
first time a system rebuild is required as a result of a failure.

5.1.1 Production/Test BladeFrame

There are three production BladeFrames at the primary site running the Production LPAN. This service
will only be available at one site at a time. The BladeFrames at the secondary site will be used for Test
LPANs during normal operation. In the event of failover the Test LPANs will be shut down and the
Production LPAN started at the secondary site.

The PAN Manager service will at all times be in a Production state to facilitate failover, and therefore the
Blade connections will be to the Management VLAN at each site.

Test provisioning systems that need to communicate with the PAN Manager Service will do so through a
firewall using an LPAN Administrator role for the test service they are provisioning.

The detailed design of the LPANs is described in the BladeFrame HLD (DES/PPS/HLD/0025).

5.1.2 Active/Active BladeFrame

There is one active / active BladeFrame pair which will host services which are required at all times at
both Data Centres. These may need to be available to support site failover, or the application
model of resilience and DR may be implicitly the same.

There is no equivalent of this in Test, and the test rig design must make due allowance. Many of these
services are active/active in order to provide resilience which is not part of the requirement for that test
rig, and the extra systems may be omitted unless resilience is a specific part of the test.

Examples of such services include domain controllers and DNS servers, and also elements of the Hydra
Branch services.

5.2 Without BladeFrame

The Platforms and Storage Architecture mandates that a platform is only permitted to be outside
BladeFrame if:

- It has some hardware that is not provided by standard BladeFrame e.g. serial cards in Aurora
  server

- It is not based on Intel architecture e.g. SPARC Solaris
- It requires direct SAN attachment e.g. systems with SYMCLI
- Some other business justification exists

These services will be run with one instance in a discrete server in each Data Centre. This is
undesirable, as these servers are difficult to move from a Test domain to a Production domain, and they
tend to multiply rapidly which has a detrimental impact on running costs and system complexity.

These systems are inherently more prone to outage due to component failure, and the "server explosion"
may be further exacerbated by the need for a local N+1 resilience model. Even for active/active systems
due consideration must be taken of the need for continued resilience and service availability AFTER the
loss of a site.

Note that although NT4 systems are not supported directly by eGenera, the use of virtualisation permits
their hosting in the BladeFrame, e.g. on a VSH platform.

6 Availability/Resilience

6.1 Service Level Targets

The migration agreed assumptions and constraints strategy REQ/CUS/STGI/0001 defines relaxations to
DR requirements during migration.

There are a number of availability requirements presented in Service Level Agreements between the
customer and the account. There are also a number of contract schedules. Typically the contract
schedules should only reflect the penalties for missing a target and not define targets themselves.

An overall view of the availability requirement is presented in the Systems Quality Architecture
(ARC/PER/ARC/0001).

6.2 Design Criteria
Resilience targets are met by the application design.
This section lists and clarifies some principles when designing for resilience.

6.2.1 Application Design

Host Applications Database Design and Interface Standards (DES/GEN/STD/0001) lays out standards
for application designers that direct them towards delivering an application that is inherently recoverable,
and that conforms to standard mechanisms for raising alerts to the estate monitoring system.

The effect of component failures, the alerts raised and the recovery mechanism should be detailed in the
Service Resilience and Recovery Catalogue (SRRC).

For database applications the effect of each object in a database becoming unreadable and the
mechanism for recovery should also be detailed.

The failure of upstream and downstream systems, any alert raised and its effect on service should also
be considered in the design and covered in the Application Support Guide. Manual intervention should
not normally be required to recover from such failures.

Many services are designed to cope with the loads of a peak transactional day, and the effects of failures
at other times may not be as severe. The SRRC should provide guidance on whether graceful
degradation in performance or actual loss of service is expected as the result of a failure.

6.2.2 Alerting and event management
Need to review DES/APP/HLD/0007 Host Applications Monitoring

Component failures will result in events being raised. The failure of a single component, e.g. a network
interface, may result in a number of events being raised independently, e.g. the network switch and the
server may both raise an event.
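
To illustrate why one failure can produce several events, the sketch below groups events by the component they refer to, so that reports from the switch and the server about the same network interface appear together. This is not how the monitoring architecture is implemented; the event tuple format, the function name and the sample component names are hypothetical.

```python
from collections import defaultdict


def group_by_component(events: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (reporting_source, failed_component) events by the failed component."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for source, component in events:
        grouped[component].append(source)
    return dict(grouped)


# Hypothetical events: the first two describe the same failed network interface.
events = [
    ("network switch", "server01 NIC 0"),
    ("server01", "server01 NIC 0"),
    ("server02", "server02 power supply 1"),
]
for component, sources in group_by_component(events).items():
    print(f"{component}: reported by {', '.join(sources)}")
```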

The System and Estate Management: Monitoring Architecture (ARC/SYM/ARC/0003) describes how
events are gathered and analysed.

The events raised must be identified in the SRRC to enable the event filtering system design to present a
relevant interpretation of the event to the Systems Management Centre, to assign an appropriate priority
to dealing with the event, and to direct the appropriate support teams to respond to the event.

Realtime Active Dashboard (RAD) is designed to filter these events and provide a consolidated business
view. The model for this is given in ARC/GEN/STD/0001.

6.2.3 Reduce Single Points of Failure (SPOFs)
Single points of failure will be eliminated where possible.
Examples of remaining SPOF and their mitigation include:
- BladeFrame chassis failure: internally each frame is highly resilient. The chassis is just the
  steel structure holding the individual components plus the backplane, which is just copper.
  Excluding deliberate physical attack and events which destroy the Data Centre, these
  components are unlikely to fail.

- Storage arrays: internally each array is highly resilient. Business critical systems are
  replicated to at least two physically separate arrays.

- Branch router: this causes the loss of only one branch.

- Counter: this causes the loss of only one trading counter.

- C&W link to Data Centre: network triangulation via the secondary Data Centre C&W and
  inter-site link will ensure the service will continue if the C&W link to the primary Data
  Centre fails.

- IRE11-TH1 has a single generator. The likelihood of this failing whilst there is also a power
  cut is mitigated by regular testing and servicing. Only services which are non-critical or able
  to fail over to IRE19 independently will be located in TH1 until the site is upgraded by the
  commissioning of the old TH2 generator.

- IRE19 is only supplied by a single sub-station. This is mitigated by the ability to run for an
  extended period on generator.

A number of systems are not fully N+1 resilient in the event of loss of the primary Data Centre.
Examples include HP OpenView where the licensing cost precludes having four systems just to provide
full N+1 resilience, and the POLFS Central Instance server which is being migrated "as is" from Bootle &
Wigan.

A number of other systems, e.g. the EMC Remote Gateway server and Supplier Access Server, do not
have local N+1 resilience as there is adequate time to rebuild these systems should they fail.

[DN: Should there be a schedule in the SVM/SDM document set that details these items? Basic process
is to identify areas and bring to attention of risk manager]

6.2.4 System redundancy

All systems will be fully redundant internally.
The BladeFrame already has these capabilities.
Discrete servers will have the following:

- Dual HBAs (independent cards, as opposed to dual-port cards), attached to two different Director
  switches, if SAN attached

- Dual NICs (independent cards, as opposed to dual-port cards)


- Disks set up in a redundant fashion with either RAID-5 or RAID-1
- Redundant power supplies

In Horizon many servers were deployed that did not comply with this approach because the overall
solution deployment could cope with the failure of a single server. On balance experience has shown that
the support cost is lower if more resilient servers are deployed, especially where this reduces overnight
call out.

6.2.5 Cluster

Clusters are suitable for systems that require very high availability. Such systems include the Branch
Database and Network Persistent Store. Clusters are typically very complex to manage and generally
expensive to implement.

BladeFrame and Oracle RAC provide frameworks that make cluster deployment relatively
straightforward, but the resilience of agents and interface layers such as the Banking Authorisation
Agents and Branch Access Layer still demands specialist design expertise.

It should be noted that a cluster is simply a shared database, and whilst this may provide for a very high
availability in the event of server failure it provides no protection in the event of deliberate or accidental
data corruption.

6.2.6 Redundant systems

Externally this is provided by two or more systems which all offer the same service, either as peers or
managed by Cisco ACE or similar load balancing.

A similar effect may be achieved by placing servers in the BladeFrame where the failure of the processor
or memory results in what appears to be a simple reboot as the server fails over onto a new pBlade.

This is not suitable for protecting against events such as boot disk corruption, but as the pServers all boot
from SAN it is possible to provide clone images for quick recovery (less than ten minutes).

6.2.7 Warm standby
Warm standby is suitable for systems requiring failover in between 30 minutes and 2 hours.

The standby server will be placed into a standby mode, for example Solaris can stop at run level 2, where
it is ready to take over the service, but the application disks are not mounted and the applications are not
started. If the primary host were to fail, a script would be run to mount the disk(s), start the application(s)
and present any service addresses.
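
The takeover order described above can be sketched as an ordered plan. This is a minimal illustration only: the function name, the step labels and the example mount point, application name and service address are hypothetical and are not taken from any HNG-X script.

```python
def takeover_plan(disks, applications, service_addresses):
    """Return warm-standby takeover steps in the order given in the text:
    mount the disks, start the applications, then present service addresses."""
    steps = []
    steps += [("mount", disk) for disk in disks]
    steps += [("start", app) for app in applications]
    steps += [("present", address) for address in service_addresses]
    return steps


# Hypothetical values, for illustration only.
for action, target in takeover_plan(["/app/data"], ["example_app"], ["192.0.2.10"]):
    print(action, target)
```

Presenting the service addresses last means clients cannot reach the standby host until the disks are mounted and the applications are running.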

6.3 Server placement within the BladeFrame

Ideally for high availability as many servers as possible within an N+1 configuration should be placed in
different BladeFrames to spread the load in case of a BladeFrame failure.

However, the internal network switching of the BladeFrame is much more performant than inter-BladeFrame
switching. The Oracle RAC databases for the Branch Database need to have a very low latency
interconnect to run with acceptable performance, and this may be achieved by having all the Oracle RAC
branch database servers in the same BladeFrame chassis.


Although this appears to be less resilient as the loss of the BladeFrame chassis would affect all
instances, the BladeFrame chassis is highly resilient, and in any case the loss of an entire chassis would
be likely to trigger DR.

The BladeFrame has four power modules or PIMs labelled A through D. Each of these powers a different
set of six pBlade slots, and the A and B PIMs supply cBlade1 & sBlade1 and cBlade2 & sBlade2
respectively. Services which reside within the same frame should consider the effect of PIM failure on the
service, for example cluster members should reside in different power domains to avoid widespread
disruption of service upon PIM failure.
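
A placement check of this kind can be sketched as follows. The text states only that each PIM powers a different set of six pBlade slots; the contiguous slot-to-PIM mapping below (slots 1-6 on A, 7-12 on B, and so on) is an assumption for illustration, as are the function and member names. The actual layout should be taken from the Platform Hardware Instance List.

```python
from collections import defaultdict

PIMS = "ABCD"
SLOTS_PER_PIM = 6  # from the text: each PIM powers a set of six pBlade slots


def pim_for_slot(slot):
    """Map a pBlade slot (1-24) to a PIM.
    ASSUMPTION: slots are assigned to PIMs in contiguous blocks of six."""
    if not 1 <= slot <= len(PIMS) * SLOTS_PER_PIM:
        raise ValueError(f"slot {slot} out of range")
    return PIMS[(slot - 1) // SLOTS_PER_PIM]


def shared_power_domains(cluster_slots):
    """Return {pim: [members]} for every PIM powering more than one cluster member."""
    by_pim = defaultdict(list)
    for member, slot in cluster_slots.items():
        by_pim[pim_for_slot(slot)].append(member)
    return {pim: sorted(members) for pim, members in by_pim.items() if len(members) > 1}


# Hypothetical cluster: node1 and node2 would both be lost if PIM A failed.
print(shared_power_domains({"node1": 1, "node2": 4, "node3": 13}))
```

An empty result indicates that no two members of the cluster share a power domain under the assumed mapping.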

The failure of LAN and SAN connections is handled automatically by PAN Manager. In the event of
multiple failures, or failure of a cBlade followed by subsequent failure of a LAN or SAN interface in the
remaining cBlade, service may degrade to the point where DR is required.

The Platform Hardware Instance List DEV/GEN/SPE/0007 provides a detailed view of the layout and may
be used for example to determine the effect of a PIM failure.

6.4 Storage

6.4.1 Storage Arrays
Both the Symmetrix and Clariion storage arrays have a high level of internal resilience.

Storage arrays are provided with power from separate Data Centre power supplies, and these
are themselves supplied through independent uninterruptible power supplies and separate breaker (fuse)
panels. Internally the storage arrays have many features that mean that failure of a single component is
unlikely to affect the ability of the array to continue offering a service. Where application resilience
demands it, redundant data paths are provided to the disk volumes through separate mirrored SAN
fabrics.

Within individual disk arrays RAID is used to ensure data integrity in spite of the loss of a disk drive
(either a RAID-1 mirror or RAID-5 parity stripe).

The disk arrays also allow point-in-time copies of data to be maintained as snapshots or cloned copies.
These enable rapid recovery from corruption, but the management of making the clone copy and also
recovering from it is the responsibility of the application designer.

Storage design is covered in detail in DES/PPS/HLD/0007 and DEV/INF/LD/0004. The actual mapping
of storage to platform instances is in DEV/INF/LLD/0043.

The majority of use of clones is through the backup solution as discussed in the Backup & Recovery High
Level Design (DES/SYM/HLD/0015).

6.4.2 SAN Fabric

The SAN Fabric is built around two fibre-channel switches (directors) at each Data Centre.
The two switches are independent, that is, they form two separate SAN fabrics.

This provides at least two forms of redundancy — firstly, the two fabrics allow for failure of any element in
any one fabric (assuming both server and storage are connected to both fabrics). Secondly, a (human)
error during a configuration change on one fabric will not affect the other independent fabric, typically
allowing the change to be corrected before any adverse results are encountered.

Details of the SAN configuration are presented in HNG-X SAN Design & Patching Schedule
Spreadsheet (DEV/INF/ID/0003). SAN High Level Design is in DES/NET/HLD/0007.


6.4.3 Host Systems

All systems that connect to the storage arrays will have two HBAs. Each HBA will connect to a different
FC director, and therefore to a separate fabric. Multi-pathing is managed either by the control blades in
the BladeFrame, or by host based multi-pathing for the relevant platform. This may be Solaris Leadville,
Symantec (Veritas) Dynamic Multi Pathing (dmp) or EMC PowerPath. The choice for each platform
foundation will be detailed in the platform foundation high level designs: Windows 2003
(DES/PPS/HLD/0001), Red Hat Enterprise Linux (DES/PPS/HLD/0002) or Solaris 10
(DES/PPS/HLD/0012). There are some historical Solaris 9 platforms. These will use dmp.

If one of the Storage ports, HBAs or FC directors fails or if there is a cabling problem, it will not cause the
server to lose its connection to the storage. Connectivity should be designed so that this will also not
affect performance.

BladeFrame may either allow many paths to a single device, or the paths may be segregated so
that many more devices may be presented. If there are four path-groups (the maximum, each with one
Blade HBA) then 1024 devices may be presented, 256 to each path group. Note that if four path groups
are defined there is no resilience to dual failure, and this should normally only be configured for services
that are not business critical.
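
The device limit quoted above is simple arithmetic, sketched here. The figures of 256 devices per path group and a maximum of four path groups come from the text; applying the same per-group limit to configurations with fewer path groups is an assumption, and the function name is illustrative.

```python
DEVICES_PER_PATH_GROUP = 256  # from the text
MAX_PATH_GROUPS = 4           # from the text


def max_presentable_devices(path_groups):
    """Devices that may be presented for a given number of path groups.
    ASSUMPTION: the 256-per-group limit applies for any number of groups."""
    if not 1 <= path_groups <= MAX_PATH_GROUPS:
        raise ValueError("path_groups must be between 1 and 4")
    return DEVICES_PER_PATH_GROUP * path_groups


for groups in range(1, MAX_PATH_GROUPS + 1):
    print(groups, max_presentable_devices(groups))
```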

The BladeFrame High Level Design (DES/PPS/HLD/0025) will detail the setting up of storage
connectivity.

Note that the BladeFrame uses a “least busy path” algorithm. Presenting too many paths to a disk will
have a detrimental effect on performance. The optimum is 2, 4 or 8 paths.

6.5 Network

All network components are deployed in pairs at each Data Centre. The only exception as
outlined in Section 6.3 is the C&W link which has a single CE router and single HO router as having pairs
would not substantially improve service availability.

Every discrete server that connects to the network will have at least two NICs. Each NIC will connect to a
different network switch. The NICs will be configured in an Active/Passive configuration (not load
balanced) with Switch 1 as the preferred switch.
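
The Active/Passive behaviour with a preferred switch can be sketched as below. This illustrates the selection rule only, not the teaming driver of any particular platform; the function name and switch labels are hypothetical.

```python
def active_nic(link_up, preferred="switch1"):
    """Pick the NIC that carries traffic in an Active/Passive pair.

    link_up maps a switch name to True if the NIC attached to that switch has link.
    Returns the switch name of the active NIC, or None if no NIC has link.
    """
    if link_up.get(preferred):
        return preferred
    for switch, up in link_up.items():
        if up:
            return switch
    return None


print(active_nic({"switch1": True, "switch2": True}))
print(active_nic({"switch1": False, "switch2": True}))
```

Only one NIC carries traffic at a time, so the bandwidth available to the server does not change when the pair fails over.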

The Platform Foundation for each discrete platform type, including the BladeFrame HLD will state how
this is implemented for each platform. Applications may have special requirements, and these will be
covered in the HLD and highlighted in the PPD for such platforms.

Some appliances may only provide one NIC. Where such appliances are business critical there will be
adequate performance available from the remaining services following the failure of a single Catalyst
switch.

Horizon systems will be migrated as is, and no upgrades will be made. As the majority are being hosted
on MS Virtual Server Host in the BladeFrame this is not relevant to most of the migrated Horizon Branch
systems (Hydra).


6.5.1 Basic Topology

This section provides only an overview. For further details refer to the Network Technical
Architecture (ARC/NET/ARC/0001).


The network is split into major subsystems which are defined in Data Centre LAN Design
(DES/NET/HLD/0004), Wide Area Network Design (DES/NET/HLD/0009), Branch Access HLD
(DES/NET/HLD/0014), and Transit LAN Design (DES/NET/HLD/0015) which presents models for
connecting to third parties such as the financial institutions.

For details of the Data Centre LAN implementation please see DEV/INF/LLD/0041 which shows the
various routing protocols in use and the interfaces to Transit and Branch LANs.

The figure below gives a very high level view of the switch connectivity that forms the basis of providing a
resilient Data Centre network service with DR capability. The inter-site links are leased dark
fibre from Virgin (NTL) with an FTEL service. The links do not share any single point of failure, and there
is a minimum 10m component separation at all points along the route.


[Figure: switch connectivity between Ireland 11 and Ireland 19, including the access multilayer switches at each site]


Connections from outside the Data Centre are via dedicated Cable & Wireless 155 Mbps
circuits, one to each Data Centre. Resilience of client connections is provided via the inter-site
link.

The segregation is such that a DR Business Continuity Test should only need to consider failover of the
core switch layer. The Access layer is covered by normal component resilience testing.

6.5.2 Server Connections

6.5.2.1 BladeFrame

All server instances on BladeFrame reside on the Core switches, and the BladeFrame is not physically
connected to the Access switches.

It is necessary to distinguish between the management interface, which is a single connection to each
cBlade, and the network ports available for pServer instances. Each BladeFrame will attach one cBlade to
the header and one to the footer. For out of band support an Aurora Console Tower system (CON) will be
provided that allows secure, managed serial connections to the serial port on the control blade.

Each BladeFrame has two management ports, one on each cBlade, and these are connected to different
Data Centre switches. The PAN Manager service has a virtual IP which fails over to the
master cBlade. The Blades provide proxy arp resolution for each other, and simple ping tests are an
unreliable way of tracing network problems.

Each cBlade has four on-board and four PCI based gigabit NICs for use by the pServers. These are
grouped into resilient Ethernet interfaces (rEth) which may also span BladeFrame chassis as mega-rEth
(mrEth) if a BladeFarm has been formed to allow a pServer to move between chassis.

Virtual switches are created within the PAN, and these are identified with the NIC that traffic is passing
through and the security domain of the VLAN ID, e.g. vSwitch1_DB or vSwitchi5_SAS. The LPAN
Administrator will ensure that servers are only permitted access to those switches associated with VLANs
in which that server resides. The PAN Administrator will ensure that Test VLANs are not visible to
Production systems and vice-versa.

6.5.2.2 Discrete Linux

Discrete Linux servers on RX300 will utilise one on-board Intel interface and one Broadcom 57xx NIC
interface. Broadcom drivers will offer an active/standby service.


For details please refer to the Linux Platform Foundation HLD (DES/PPS/HLD/0002).

For out of band support an iRMC port will be provided, and a Fujitsu-Siemens KS1621 KVM that supports
connections over TCP/IP.

6.5.2.3 Discrete Windows

Discrete Windows servers on RX300 will utilise one on-board Intel interface and one Broadcom 57xx NIC
interface. Broadcom drivers will offer an active/standby service.

For details please refer to the Windows 2003 Platform Foundation HLD (DES/PPS/HLD/0001).

For out of band support an iRMC port will be provided, and a Fujitsu-Siemens KS1621 KVM that supports
connections over TCP/IP.

6.5.2.4 Solaris

Each Solaris server will have at least dual NICs presented one to each switch. It is usual on larger
servers to deploy two quad port ethernet cards (fiqe) so that in the event of port failure a separate port is
readily available and the replacement is a simple card swap.

The PrimePower PCI bus only has two slots capable of taking gigabit class cards, and these are usually
occupied by the HBAs for the SAN, so PrimePower systems will not normally be connected via gigabit
connections. It is however becoming common for systems even as small as the PW250 to have two gigabit
Ethernet connections on the motherboard. The use of these provides a certain level of resilience e.g.
against switch or cable failure, but is not fully resilient.

Solaris ipmp will be used to provide network multi-pathing. This presents a base IP address for each
card, and a third virtual address which fails over to the active card. Multiple virtual addresses may be
overloaded onto a single interface.
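
The three-address arrangement can be modelled as below. This is a sketch of the address placement only, not of the Solaris configuration itself; the card names and addresses are hypothetical documentation values.

```python
def place_virtual_addresses(cards, virtual_ips):
    """Model where each address lives in the three-address method.

    cards: list of (name, base_ip, healthy) tuples in preference order.
    Returns {card name: [addresses]}. Each card keeps its own base address and
    the virtual addresses sit on the first healthy card.
    """
    layout = {name: [base_ip] for name, base_ip, _ in cards}
    active = next((name for name, _, healthy in cards if healthy), None)
    if active is not None:
        layout[active].extend(virtual_ips)
    return layout


cards = [("nic0", "192.0.2.1", False), ("nic1", "192.0.2.2", True)]
print(place_virtual_addresses(cards, ["192.0.2.10"]))
```

Because each card keeps its own base address, the failed card can still be probed individually while the service address is carried by the other card.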

Solaris 10 provides Layer 2 detection as well as ping detection, and Layer 2 detection is preferred. In this
case the base addresses are not needed. The Solaris Platform Foundation HLD (DES/PPS/HLD/0012) defines
this as the method to be used for HNG-X.

Solaris 9 systems (POLFS) that migrate to Belfast will continue to use the three address ipmp method.

For out of band support an Aurora Console Tower system (CON) will be provided that allows secure,
managed serial connections.

6.6 Counter Access
Branch Access HLD (DES/NET/HLD/0014).

6.6.1 Counter network access

6.6.1.1 VPN
DEV/INF/LLD/0022

Horizon counters connect via a single gateway machine in the Branch, and encryption is provided by
Utimaco VPN. The longer term goal is to replace this with Branch Router and SSL encryption, but the
VPN solution needs to be retained until the counter is complete and the operating system has
been upgraded to XP.

There are enough VPN servers at each site that a single site has N+1 capability.

6.6.1.2 Branch Router
Branch Router HLD (DES/NET/HLD/0010)

The branch router is an 'off the shelf' router. Its primary purpose is to provide seamless failover to a
backup GPRS network should the standard (usually ADSL) network not be available.


It has no high-availability features built-in. If the router were to fail, an engineer would visit the site and
install a new one. This is no different to the Gateway PC failing in Horizon, but it is estimated that the
likelihood of appliance failure is lower, and general availability should be improved.

There are multiple ways for the router to connect to the C&W: ADSL, ISDN, PSTN, GPRS (3G). If the
primary connection method were to fail, then the router will automatically connect via an alternative
method.

6.6.1.2.1 ADSL
The router connects to one of six LNSs.
The LNSs peer with the C&W and the C&W MPLS network.

6.6.1.2.2 ISDN/PSTN
ISDN/PSTN connections dial in to C&W routers at diverse sites round the country.

[DN: ISDN was supposed to have been done away with at HNG-X but in some areas it is still the
only feasible mechanism]

6.6.1.2.3 GPRS/3G

Two GGSNs (Gateway GPRS Support Nodes) in different locations, each connected to a different C&W
POP.

6.6.1.2.4 VSAT
[DN: Looks like we'll still have them.]

A small number of remote sites use satellite connections. The long term goal is to replace these
connections as ADSL coverage improves.

6.6.1.3 Network connection to Data Centre

The counter traffic is passed through the C&W cloud and makes a connection to an end-point in the
Data Centre Branch DMZ. This connection is authenticated by a RADIUS server. There are
RADIUS servers at both Data Centres operating in active/active mode. One reason for this is
that in the event of a site failure the counters will start polling to reconnect, and if no RADIUS server is
present they may continue polling for an extended period, which puts an undue load on the external
supplier.

6.6.1.3.1 Network triangulation

If the C&W link to the primary Data Centre fails, then traffic will be re-routed via the secondary
Data Centre and across the intercampus link.
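
The triangulation rule can be sketched as a path choice. The hop labels and function name are illustrative only; in practice the re-routing is performed by the network, not by application code.

```python
def counter_traffic_path(primary_cw_up, secondary_cw_up, intersite_up):
    """Return the hops used to reach the primary Data Centre, or None if unreachable."""
    if primary_cw_up:
        return ["C&W", "primary Data Centre"]
    if secondary_cw_up and intersite_up:
        return ["C&W", "secondary Data Centre", "inter-site link", "primary Data Centre"]
    return None


# The primary C&W link has failed; traffic triangulates via the secondary site.
print(counter_traffic_path(False, True, True))
```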

6.7 Power
The power provided to the Data Centre sites can be summarised as follows:

The Primary Site is at IRE11. This has two feeds from independent substations to onsite transformers.
The site is capable of running for three days at peak load when on generator power, or longer provided
fuel deliveries to site are possible. IRE11 has two computer rooms (or Tech Halls) which are physically
separate buildings. TH2 is the site for the majority of the HNG-X equipment, and is fully N+1
resilient for UPS and generator power.

TH1 is connected via separate paths of single-mode and multi-mode fibre-optic to allow the use of a row
of cabinets for RMGA equipment. TH1 has N+1 resilience in UPS, but only a single generator. A second
generator has been freed by the upgrade of TH2, and after servicing it is planned to connect this to TH1
to provide full resilience. In practice the double failure of NIE supply and generator is highly unlikely, and
there is a programme of generator and UPS testing each year to ensure that the generator is serviceable.

The Secondary Site is at IRE19. This has a feed from a single substation with the transformer offsite. The
site is capable of eleven days at peak load when running on generator.

Neither site shares a common substation and Northern Ireland is a resilient part of the National Grid with
inter-connectors to Scotland and England.

There is an on-going programme of site improvement which is customer driven.


7 Disaster Recovery
Basic Principles

In HNG-X the Data Centres will be run in a Production/Test mode. All processing
normally takes place at the primary Data Centre. In the case of a disaster all processing
activity will be transferred to the secondary site.

A disaster is an event which renders a Data Centre incapable of providing the service.

Business Continuity Planning (SVM/SDM/SIP/0001) will define what events are
deemed to be significant enough to trigger a decision to fail over. Whilst a number of events are clearly in
this category, generally this process is non-deterministic and the Service Manager must make a decision
and agree it with the Customer before invoking failover.

A regular schedule of testing is agreed each year with the customer both for component resilience and for
site failover. As familiarity is gained from operational experience some tests may be scaled back to a
procedural walk-through, but there will be at least one full site DR test per year. This is driven by the Business
Continuity Test Plan (SVM/SDM/PLA/0003).

Failover will cause any testing that is in progress to be halted.

If the primary event has not caused loss of service, then the failover itself will, so the actual start of
failover may be delayed until outside core hours to minimise the business impact. For the same reason the
normal start time of site failover tests or planned failover for maintenance will be around 0200 on a
Saturday, with a failback at 0200 on a Sunday.

The secondary Data Centre will provide at least as good a service as the primary one.
Exceptions:

Failure of the C&W connection into the primary Data Centre is covered by connection to the
secondary Data Centre and inter-campus routing. In the event of a failure of the link to the
secondary Data Centre, DR is inhibited. In the event of DR the C&W connection is not N+1.

Connections to 3rd party services that rely on C&W are affected similarly. Since no counters would be
able to transact this is moot, but may trigger separate penalties.

Power supply following DR is not fully N+1.

7.1 Service Level Targets

The requirements for Data Centre Service Availability are given in the Data Centre
Operations Service: Service Description (SVM/SDM/SD/0003).

The timings for failover design are assumed to run from the time that disaster recovery is authorised.

No time has been allowed for decision making, but until the design has enough detail to be able to
determine how close it is to the 2 hour target for having branches operational it is not possible to quantify
this as a risk. As previously noted, a decision to fail over may be deliberately delayed if the service is
degraded rather than fully unavailable.

For the purposes of providing an end-point which has meaning to most people in all areas of the solution
(architects, designers, operations, service managers, customer) a working definition has been used of the
moment that the first counter is able to receive a network banking authorisation message. This is not the
same as the strict service measure, and a means will need to be developed of assessing the actual
service outage.


7.2 Disaster Recovery Process Overview

[Figure: Disaster Recovery process overview]


In order to allow business critical services to be given priority the conceptual overview presents a similar
view to the migration design, and groups services into a small number of logical containers. These will be
enumerated later. All that is important in this overview is that some services operate Active/Active, and
some services have a permanent presence at one or other Data Centre, e.g. PRODUCTION*.

It is impractical to make the Data Centre pair separate completely into Production and Test
and still achieve such a high availability, and the design represents a compromise that minimises the
number of services that must be "warm". The design goal for eliminating these is that their equivalent
must then also be present in TEST*, and this represents extra equipment that must be purchased,
maintained and supported.

Systems required to permit or facilitate support staff access during DR must be available at the
secondary site. These include the SAS terminal servers, AD servers, and systems to facilitate network
and storage management. These will all be included in the PRODUCTION* set of systems, and will be
available without interruption following failure of the primary site.

Until the AD FSMO Roles are failed over, passwords cannot be changed, systems cannot change
domains or new domains be added, and users cannot be added. Services which need to become
available quickly should not rely on any of these features.

POLFS is effectively still deployed as an independent system capable of independent failover.

It is possible for two Normal states to exist, one with POLFS Productive at IRE11, and the
other with it at IRE19.

It is also possible that the failure at the primary site could be such that there is no loss of site, but rather
loss of a major infrastructure component like an EMC array or a BladeFrame. In this case it is likely that
the WAN triangulation would be left operational and POLFS Productive would remain in IRE19.


In the event of loss of IRE19 there will be a loss of the testing service, including POLFS QA. The service
is expected to continue without interruption, although there will be a loss of resilience.

7.3 Disaster Recovery procedure

7.3.1 Site Failover

Site failover is covered by a Major Incident Process. Much of this is unchanged from Horizon, and the aim
of this section is to give an overview of the process and identify new processes which need to be
developed.

Some of these steps will be capable of being carried out in parallel. Major checkpoints will be included in
the business continuity plan to allow coordination of steps which depend on each other.

Testers will be given as much notice as possible to stop their testing and shut down their systems
cleanly.

A message will be put on the Help Desk phone to inform Post Masters and Mistresses of a major problem.
Authorisation will be sought for failover.
Availability and operation of support services will be confirmed.


If possible production servers and LPANs at the primary site will be shut down cleanly, followed by Test
LPANs at the secondary site.

The test network will be disabled and the production network prepared for operation from the secondary
site.

The storage will be failed over.
The Production LPANs will be started at the secondary site.
Business critical services will be started.

Other services will be started.
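
The steps above, and the remark that some may run in parallel between checkpoints, can be sketched as a dependency graph. The step names paraphrase the text, but the dependencies between them are assumptions made for illustration; the business continuity plan defines the real checkpoints.

```python
from graphlib import TopologicalSorter

# step -> steps that must complete first. ASSUMPTION: illustrative dependencies only.
FAILOVER_STEPS = {
    "notify testers": set(),
    "help desk message": set(),
    "authorise failover": set(),
    "confirm support services": {"authorise failover"},
    "shut down primary production": {"confirm support services"},
    "shut down secondary test LPANs": {"shut down primary production", "notify testers"},
    "switch network to production": {"shut down secondary test LPANs"},
    "fail over storage": {"shut down secondary test LPANs"},
    "start production LPANs": {"switch network to production", "fail over storage"},
    "start business critical services": {"start production LPANs"},
    "start other services": {"start business critical services"},
}


def failover_waves(steps):
    """Group steps into waves; steps within one wave have no mutual dependency."""
    sorter = TopologicalSorter(steps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(failover_waves(FAILOVER_STEPS), start=1):
    print(number, wave)
```

Each wave boundary corresponds to a point where coordination is needed before dependent steps can begin.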

States of sites:
Normal Running:
- Primary: Production
- Secondary: Test/Production*
After Disaster:
- Primary: non-functional / unavailable
- Secondary: Test/Production*
After Failover:
- Primary: Unavailable
- Secondary: Production / Limited Test
After Failover and after primary site fault is fixed:
- Primary: Standby
- Secondary: Production / Limited Test
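
The site states listed above can be held as a small lookup table. The role values are taken from the list; the state keys and helper names are illustrative and not part of the design.

```python
# Roles per state, in the order they occur. Keys are illustrative labels.
SITE_STATES = {
    "normal running": {"primary": "Production", "secondary": "Test/Production*"},
    "after disaster": {"primary": "Non-functional / unavailable", "secondary": "Test/Production*"},
    "after failover": {"primary": "Unavailable", "secondary": "Production / Limited Test"},
    "after failover, primary fixed": {"primary": "Standby", "secondary": "Production / Limited Test"},
}
ORDER = list(SITE_STATES)


def next_state(state):
    """Return the state that follows `state` in the sequence, or None at the end."""
    index = ORDER.index(state)
    return ORDER[index + 1] if index + 1 < len(ORDER) else None


def production_site(state):
    """Return the site running full Production in `state`, or None if neither is."""
    for site in ("primary", "secondary"):
        if SITE_STATES[state][site].startswith("Production"):
            return site
    return None


print(production_site("after failover"))
```

The "after disaster" state has no site in full Production, which is the period during which the service is unavailable.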

7.3.1.1 Authorisation and notification

Once Fujitsu has determined that site fail-over is necessary, the Customer will be informed through the
normal mechanism for a major incident. Once approval to fail over has been received the process is
irreversible. The SMC will manage the interaction of the various support teams, and keep Service
Managers informed.

Note that there is a new requirement for SMC over Horizon, which is to inform the Test community if
events are observed that indicate DR is likely to be invoked.

7.3.1.2 Network

It is important that the counters do not get directed to the primary site once failover has started. Although
generally we will have caused Branch Database to be unavailable there are still conditions where the
inter-site link may have become partitioned and we need to inhibit transactions to the site we are about
to abandon.

The VIPs for all counter services wil
C&W with a lower cost.

be advertised from the secondary data-centreData Centre to the


[DN: Not sure that this is a guarantee. How do we guarantee it? Does it matter?]

Testing access to the secondary site need not be disabled, as the authentication domains are
separated and testers will only be able to authenticate if their service is explicitly started.

[DN: Need to see final Network design]
MSFCs supporting the extended VLANs [DN: We may not need to worry about this. They have nothing to ...]

Subnets used by BladeFrame hosted services will be disabled at the primary site and failed
over to the secondary site.

7.3.1.3 Storage

The Solaris backup servers will operate as Storage Management systems to enable failover of storage to
be managed through scripts which will be called out by a master checklist.

All BladeFrame boot LUNs are synchronously replicated from the primary site to the secondary site.

All storage that requires zero data loss on failover is synchronously replicated from the primary site to the
secondary site.

Replicated LUNs are normally Read/Write on the primary site and Read-Only on the secondary site.

Storage is managed by permitting an LPAN to use certain disks. The Production LPAN can only use
disks which have been allocated for Production, and each test rig can only use disks which have been
allocated for that rig.

Discrete servers will not be moved from Test to Production without a complete rebuild, which will include
changing the VLAN they are connected to and the storage that is presented to them. This storage is
typically managed by LUN Masking, an operation which can only be performed from the Storage
Management Server.

The storage "failover" command will be issued, which will write disable storage at the primary site and
write enable storage at the secondary site. This also inhibits replication until it is deliberately reinvoked.

At some point after failover an assessment will be made as to when and whether failback is possible.

At this point the direction of replication will be reversed to replicate changes made at the secondary site
(now in a Production state) back to the primary.
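
The storage transitions described in this section (failover, then a later reversal of the replication direction) can be modelled as a small state object. This is an illustrative sketch only: the class and attribute names are hypothetical and do not represent the storage vendor's toolkit or its commands.

```python
from dataclasses import dataclass


@dataclass
class ReplicatedStorage:
    """Toy model of a replicated LUN set; names and states are assumptions."""
    writable_site: str = "primary"            # the other site is read-only
    replication: str = "primary->secondary"   # or "inhibited", "secondary->primary"

    def failover(self):
        # Write disable the primary, write enable the secondary,
        # and inhibit replication until it is deliberately reinvoked.
        self.writable_site = "secondary"
        self.replication = "inhibited"

    def reverse_replication(self):
        # Only meaningful once Production is running at the secondary site.
        if self.writable_site != "secondary":
            raise RuntimeError("reverse replication only applies after failover")
        self.replication = "secondary->primary"
```

The guard in reverse_replication mirrors the ordering in the text: the direction is only reversed after failover, once an assessment has been made that failback is possible.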

In Horizon the only reverse replication mode available meant that full resilience was not achieved until
after failback, but at HNG-X it is in principle possible to achieve a state where the Production
service running at the secondary site is able to treat the primary site as a DR target. This allows a more
extended period of failover running, for example if it is desired to perform failback only on a Sunday.

[DN: The risks of such an approach are difficult to quantify until more LLD work on LAN and SAN is
complete. In any case this incurs a higher outage for Test systems which may have programme
commitments for fixes or enhancements.]

It is also possible to put the storage into a split state in a controlled manner to maintain an image of the
data at the secondary site, for example during migrations where a regression image may be required.

Scripts will be provided by the storage vendor to provide a toolkit to simplify:
Failover
Reverse replication
Failback
Campus split
Campus establish
Campus recover

These will be documented in DEV/PPS/LLD/00??

7.3.1.4 BladeFrame - Active

No action is required. It may be desirable to inhibit or shut down services at the primary site, or prevent
traffic being routed to them.

7.3.1.5 BladeFrame - Production/Test

Testing on the secondary site will be stopped and all Test LPANs will be shut down. There is a lower risk
of collateral damage to the test system if it is shut down cleanly, and the Test Manager will be consulted
as soon as a major problem is identified to initiate shut down.

If access to the primary BladeFrame is available on the primary site, all Production LPANs running on the
BladeFrame will be shut down. This enables storage to be in a “clean” state when failed over, which as
well as being lower risk is also quicker as the filesystems do not need to be checked.

The Production LPAN will be prepared on the BladeFrames at the secondary site. This will involve
reassigning pServers and starting Xen hypervisors using a script on the control blade.

The configuration of the BladeFrame is stored in an XML file. BladeFrame DR allows this to be saved to a
SAN disk by an internal scheduler. In the event of difficulties the configuration of the primary LPAN may
be recovered from the backup copy.

When storage failover has occurred the Production LPANs will be available to start.

Some resource configuration may be necessary. Such differences will be minimised and scripts will be
developed for use during failover to speed up the process and minimise human error.

The individual servers in the Production LPAN will be brought up in a controlled manner in an order that
achieves a minimal outage of the Branch service. This is enumerated in Section 7.4. In order to minimise
the overall outage, smaller, stateless servers may be set to start automatically with the LPAN.

As far as possible built-in LPAN start up ordering will be used, but it may be necessary to coordinate
between frames, or to have a finer granularity. Unfortunately it is too late to put a requirement into
HADDIS that services should start up and then poll, although many of them do behave in this way.
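
The "start up and then poll" behaviour mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the default interval and the default timeout are hypothetical and are not taken from any HNG-X service.

```python
import time


def wait_for_dependency(is_available, interval_s=5.0, timeout_s=300.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll is_available() until it returns True or the timeout expires.

    Returns True if the dependency became available, False on timeout.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_available():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A service written this way can be started in any order relative to its dependency, which reduces the need for fine-grained coordination between frames during failover.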

7.3.1.6 Legacy Batch Server

At the secondary site the standby legacy batch servers will not be used for testing. They are only in
place for site failover purposes.

The server that provides N+1 resilience at the secondary site may be made available for volumetric
testing. If it has been, it will be rebuilt to provide Production N+1 resilience.

[DN: Actually doing this rebuild regularly as part of DR testing is no bad thing]

The standby server will be running permanently but will be at a lower than normal 'run level'. When
storage failover is completed the standby will be raised to the active

run level. This will automatically mount the disks used by the applications, and start services as required,
e.g. Oracle databases and the Oracle TNS Listener. ACE will detect the TNS Listener and advertise the
VIP for the failed over service.
Note that this is substantially unchanged from Horizon and is similar for POLFS systems.
7.3.1.7 Discrete Servers
The remaining discrete servers fall into a number of patterns:
Boot from SAN (ECC)
Active at both sites with either able to offer a service (NMN, DX!)
Like DAT

These services will be dealt with in detail in the service summary. The most significant are actually the
Hydra and SYSMAN2 platforms, and these are typically "Like DAT".

7.3.1.7.1 NetBackup Master Catalogue Service

The NetBackup realm consists of a number of media servers (servers attached to tape drives). In order to
allow any of these to be used to restore any backup, NetBackup provides a centralised Master Catalogue
Service. This will be hosted on the BSM platform, which is BladeFrame based.

7.3.2 Last Resort - Recovery from Backup

The backup images are provided in order to protect against application corruption. They have not been
designed to provide a failover capability, but the very fact that an application image is available means
that there is a further mechanism for restoring service.

If this method is invoked it is almost inevitable that some data loss will occur, so systems which require
DR with zero data loss cannot rely on a backup image.

It is also inevitable that, if recovery from backup is invoked, the service outage will considerably
exceed that allowed.

If this recovery is invoked, it will be on the understanding that restoring any service is better than
continuing with a service outage, but the ramifications, particularly for the application support team in
assessing the impact on audit and reconciliation, should not be under-estimated.

7.3.3 Site failback

Failback has been alluded to earlier in 7.3.1.3 when discussing reverse synchronisation of storage. This
is an essential precursor step, and if the outage has been lengthy this step in itself may take some time.


Following a failover, or more likely following a business continuity test, a point will be reached when the
business is ready to failback. It is not possible to be prescriptive, as certain events like flooding can
require considerable work to make the primary site available again, and the degree to which the
infrastructure may be tested before failback may be limited.

It is likely that active/active services can be restored in advance. These are not business critical, but are
essential to providing a secure managed service.

It is possible to state the following:
• Failback is a planned event.
• Failback will cause a service outage.
• Failback may have to be abandoned.

The high level process is very similar to failover, except that the Production services are guaranteed to be
running:
• Ensure reverse replication has completed
• Confirm availability of support services at both sites
• Automatic message on help desk and maybe memo earlier during the day
• Shut down Production services at secondary site
• Restore Production Network to normal state
• Failback storage (write enable at primary site)
• Start production services at primary site
• Confirm services available
• Permit customer connections (point of no return)
• Enable Test network
• Start Test services

After the point of no return production services would be expected to remain at the primary site.
Abandoning the failback will only occur prior to this point.
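
The failback sequence and its point of no return can be sketched as an ordered checklist. The step names below are taken from the list above; the function names and the returned outcome strings are hypothetical, and the execute callback stands in for whatever manual or scripted action performs each step.

```python
FAILBACK_STEPS = [
    "Ensure reverse replication has completed",
    "Confirm availability of support services at both sites",
    "Automatic message on help desk",
    "Shut down Production services at secondary site",
    "Restore Production Network to normal state",
    "Failback storage (write enable at primary site)",
    "Start production services at primary site",
    "Confirm services available",
    "Permit customer connections",  # point of no return
    "Enable Test network",
    "Start Test services",
]
POINT_OF_NO_RETURN = "Permit customer connections"


def can_abandon(completed_steps):
    """Failback may only be abandoned before the point of no return."""
    return POINT_OF_NO_RETURN not in completed_steps


def run_failback(execute):
    """Run each step in order; execute(step) returns True on success."""
    completed = []
    for step in FAILBACK_STEPS:
        if not execute(step):
            outcome = "abandoned" if can_abandon(completed) else "failed after point of no return"
            return completed, outcome
        completed.append(step)
    return completed, "complete"
```

The sketch does not model what abandoning involves (restarting Production at the secondary site); it only shows where the decision boundary lies.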

[DN: How do we confirm service is available without allowing an actual transaction?]

7.4 Service Start-up Ordering
[DN: Add an intro section to reiterate the SLA requirement for DR]

[DN: The list of services is not yet complete. As part of the Integration & Build process service
dependencies are being established. This section will be updated in the next version if required.]

WAN, LAN, SAN (NMN, ECC, RSG)
PAN (CON)
AD, DNS, SSN

Above this line is very difficult to get going. It represents a minimal "core" service.

EST, RADIUS, VPN
Key Management Service


BRDB [RIPOSTE]
NPS, NBAuth, Atalla, INRAI

BAL

Other Auth Agents: DCS, ETU

Client connectivity: DCM, FTMS, C:D

Web Services: DVLA, PAF, APOP, Track&Trace, Kahala, Telecom, Online Training, Helpdesk
APOP, LFS, TES, DRS [AGE] - Note that TES Harvest from NPS can be disruptive in catch-up

SYSMAN3 / SYSMAN2

[DN: This strictly lists platform types rather than services]
EACRR - What is EACRR managing now?
TWS/MSH

RDMC, RDDS, TPS, APS, DWH
Backup

Need to get the top-level view of services from the customer perspective so as to say what we now are
offering at each stage.
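
One way to reason about the ordering in this section is to treat each line of the list as a tier that must be running before the next is started. The sketch below assumes that reading, which the source does not state explicitly, and uses only the first few tiers; the list itself is noted above as incomplete. The names START_TIERS, tier_of and must_start_before are hypothetical.

```python
# Partial, illustrative transcription of the first tiers in Section 7.4.
START_TIERS = [
    ["WAN", "LAN", "SAN"],
    ["PAN"],
    ["AD", "DNS", "SSN"],
    ["EST", "RADIUS", "VPN"],
    ["Key Management Service"],
    ["BRDB"],
    ["NPS", "NBAuth", "Atalla"],
    ["BAL"],
]


def tier_of(service):
    """Return the zero-based tier index of a service."""
    for index, tier in enumerate(START_TIERS):
        if service in tier:
            return index
    raise KeyError(service)


def must_start_before(service):
    """List every service in an earlier tier than the given service."""
    return [s for tier in START_TIERS[:tier_of(service)] for s in tier]
```

For example, must_start_before("PAN") returns the WAN, LAN and SAN entries, matching the first line of the list.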

7.4.1 Supporting Services

Some systems need to be active at both sites because they are needed on the secondary site to support
site failover. Therefore the following will be running on at least one server at each site (now probably in
BladeFrame):

SSN/SAS   SAS are active/active and should already be available.

DNS       DNS will be operational as it is active/active with a master/slave relationship. If changes
          are required then the slave will need to be promoted.

AcD       Active Directory / Single Sign On / ID Management is active/active and should already be
          available. FSMO roles need to be started for full functionality.

DOM       Hydra domain controllers.

RAU       RADIUS Accounting authenticates connection to network devices.

NMN       Network Management.

VPN       VPN will exist beyond Hydra as the NT4 counter requires the protection of the Utimaco
          layer. The VPN systems will remain as in Horizon and will be hosted as guests in discrete
          Microsoft Virtual Server Hosts (VSD). Supporting services such as the policy manager
          (VPM), exception server (VEX) and loopback workstation (VDW) will be similarly hosted.
          VPN is logically located in the triangulated Branch LAN.

RADIUS    There are several types of RADIUS server, but these are all active/active and should
          already be available. This is necessary to prevent counters continuously polling for a
          connection which appears like a denial of service attack on the branch network provider.


          The Radiator product is able to cache authentication information and does not require
          EST to be deployed except for new connections.

7.4.2 Branch Critical Services (HNG-X)

What do we need running to offer a banking service? What do we need running to prove we are running
a banking service?

KMN            Key Management Server

BAL            Does BAL come up expecting BDB or does it poll?

BMX            May be required to manage BAL into service

VPN            Hydra system, active/active.

RADIUS         There are several types of RADIUS server, but these are all active/active and should
               already be available.

BranchDB       Oracle RAC in BladeFrame

NPS            Network Persistent Store. Oracle RAC in BladeFrame.

NAA            AL Auth Agent

NAL            Link Auth Agent

NAC            CAPO Auth agent (x2)

HSM            Active/active networked Atalla key generator

BranchStandby  This is not strictly critical, but it is unwise to offer a service without it operational.

7.4.3 Branch Critical Services (Hydra)

These services are mainly active/active and no action should be required, but SYSMAN2 failover and
Maestro Scheduler failover may require some sort of linking in activity.

OMD, TMR (Manage EACRR)
KMS

Active/Active

COR
AGE
NRA
KMS wil il exist may be. one of he. external systoms ocauae of the ar or o
KMS,
ACE
cM
oMD.

sps

7.4.4 Batch Services

DRS and TES are required to allow POL to have a view of transactions. TES should be no more than 15
minutes behind NPS Journals. DRS receives C12 messages which SSC use to alert on transaction
codes for detecting failures.
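
The 15 minute figure above lends itself to a simple lag check. The sketch below is an assumption about how such a check might be expressed; the function name and the idea of comparing two timestamps are hypothetical and do not describe how TES or NPS actually expose their journal positions.

```python
from datetime import datetime, timedelta

MAX_TES_LAG = timedelta(minutes=15)  # figure taken from the text above


def tes_within_target(latest_nps_journal, latest_tes_harvest):
    """True if TES is no more than 15 minutes behind the NPS Journals."""
    return latest_nps_journal - latest_tes_harvest <= MAX_TES_LAG
```

Note the earlier remark that the TES Harvest from NPS can be disruptive in catch-up: after a failover the lag may legitimately exceed the target for a period.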

[DN: Will the informal SSC detector be made formal in HNG-X?]

APS dealt with Quantum Emergency payments, but these should be defunct at HNG-X (agreed: they
have gone).

LFS deals with pouch deliveries and collections as well as planned orders. We can certainly live without
them.

RDMC may be passing Bureau de Change changes or memos. Bureau changes are typically only once a
day, but memos may be important if we want to communicate with Postmasters, and occasionally there
is an urgent Bureau update.

TPS, DW and RDDS are pretty much strictly batch.

Note that Batch systems have their own OLA and SLA, and it is possible that outside core hours this may
take priority.

SYSMAN2 will still operate primarily from IRE19.

MSH is the Hydra maestro scheduler.
DAT This is a single host with a number of databases:
TPS
APS
LFS
TES
DRS
DW
RDMC
RDDS

Generally once the databases and Oracle listener are up support staff consider the job as complete, but
there are then a number of external systems whose connectivity needs to be checked, especially the
FTMS gateways and the TWS scheduling system.

Some of these databases offer a pseudo on-line service, such as LFS pouch tracking and RDMC Bureau
de Change rate changes, or support direct on-line services such as TES Query and DRS "F99" by BSU.

7.4.5 Other Services
Branch Support (STREAMS) - any SLA on reporting?


NetBackup Master Catalogue Service
Web Services: DEA/DCS, ETU, PAF, DVLA, MoneyGram, BWS, HWS, OWS
FTMS: EDG, TIP
CDG - C:D Connect Direct Gateway for delivery of banking files.
DCM - Interface to Streamline
22DGS-APOP
Track & Trace
CF - Hydra
OCM - Hydra
DEL - Hydra
MoneyGram

7.4.6 POLFS

POLFS operates with a 48 hour SLA on DR service outage (CS/OLA/049), primarily because of the time
required to catch up on batch processing.

There is a DR facility for the Production service. This is provided by Fujitsu, and is designed such that it
does not require the direct involvement of PRISM, but would benefit from their support. In the event that
the Production service cannot be operated from the platform within the Bootle Data Centre, the Test/QA
service in the Wigan Data Centre will be closed down and the Production service built on the Test/QA
platform.

The Service Target is to have the system operational within 48 hours. POLFS is capable of failing over
independently of a campus failover as it would be undesirable to invoke campus failover because of a
failed SAP system, and because the Production/Test POLFS system is in effect a complete, independent
service to Post Office and the Test system is not used by Fujitsu Services.

The DR facility is invoked by either Fujitsu or PRISM logging a Helpdesk call. Fujitsu and Post Office will
then assess the request and obtain appropriate management approval. The service availability target
states that no single outage should exceed ten hours. Thus this is a key criterion in assessing whether
DR should be invoked.
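
The ten hour criterion can be illustrated with a small helper. This is one possible reading only: the source states the ten hour and 48 hour figures but not how they are combined in practice, so the comparison below (projected outage in place against the ten hour limit) and the function name are assumptions.

```python
from datetime import timedelta

MAX_SINGLE_OUTAGE = timedelta(hours=10)   # service availability target from the text
DR_SERVICE_TARGET = timedelta(hours=48)   # DR Service Target from the text


def dr_worth_considering(outage_so_far, estimated_repair_time):
    """True if repairing in place is projected to breach the ten hour limit."""
    return outage_so_far + estimated_repair_time > MAX_SINGLE_OUTAGE
```

The decision itself remains a management approval step; a helper of this kind would only supply one input to that assessment.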

There is an exercise going on as part of migration planning to determine just what business impact such
an outage incurs, as many of the business fallback (manual) processes developed for the predecessor to
POLFS will have changed radically.

There is a hand-off which is outside the control of Fujitsu Services, as once the SAP service is confirmed
as available it is Prism support who manage the catch-up of the batch processing. The Service Target
has not been well written in this respect.

Prism Alliance has a 24 hour SAP outage every quarter which affects POLFS. Currently Fujitsu Services
do not align to this; however it is likely to be considered as an option in the future.

In principle the initial database failover of POLFS is straightforward, and it is only the fact that this is
generating activity when many more critical systems are being failed over that makes it desirable to put it
in the low priority queue.



[DN: Confirm whether the 48 hours is for catch-up or just for the Fujitsu Services component]


8 Monitoring / EACRR Replacement

The overview and context is provided by ARC/SYM/ARC/0001.
ARC/SYM/ARC/0003 is the topic architecture for System and Estate Management - Monitoring.

Some terms appear in this chapter which are not in the Abbreviation or Glossary section of this
document. This is because Chapter 10 is in effect a glossary of the platform types enumerated in
DES/APP/HLD/0009.

8.1 Event gathering and collation
The framework can be described in several perspectives:

• Active monitoring - where agents are looking for known stimuli (for example service down,
  processor utilisation exceeded).

• Passive monitoring - where events are being raised at source, and can be classified, aggregated
  etc. and forwarded to a collection layer for subsequent aggregation, display in an event viewer and
  possible forwarding to a business monitor service.

• Business service monitoring - which provides an aggregation of underlying events in business
  service terms.

For each perspective we can describe the product architecture that delivers it as follows.
The Active monitoring perspective is provided by the ITM (IBM Tivoli Monitoring) product.

The following is illustrative of the active monitoring infrastructure:

[Figure: ITM Solution]

The products for the other perspectives are:
• Passive monitoring - IBM Omnibus products
• Business Service monitoring - IBM Netcool RAD.

The following is illustrative of the passive monitoring integration with business service monitoring.


[Figure: Event (passive monitoring integration with business service monitoring)]

Links will be made from the EMD to the KEL and incident management system so that event displays can
be enriched with KEL data and incidents raised.

Events have business, operational and security significance. The monitoring solution will be configured
and deployed to meet these needs. In some cases, such as security event monitoring, it will be
augmented by specific tools.

The products can be deployed to construct a tiered solution managing a variety of domains including:
• Platform hardware
• Operating system - Windows and Red Hat
• Bespoke Applications - both explicit application events and heuristic conclusions from
  transaction throughput
• Oracle
• HP OpenView and SNMP domains
• Storage solution
• Middleware - Interstage
• High level batch scheduling
• SAP

Each domain and its managed objects are appropriately configured and introduced into the monitoring
framework to populate event displays and higher level business service monitors, whose status is
informed by the individual status of managed objects contributing to the service availability.
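
The way a business service monitor's status is informed by its managed objects can be sketched as a simple roll-up. The "worst status wins" rule and the three status names below are assumptions made for illustration; the source does not say how the monitoring products combine statuses.

```python
# Hypothetical severity ranking; higher numbers are worse.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def service_status(object_statuses):
    """Roll up managed-object statuses to a service status (worst wins).

    object_statuses maps a managed object name to "ok", "warning" or "critical".
    A service with no managed objects is reported as "ok".
    """
    return max(object_statuses.values(), key=SEVERITY.__getitem__, default="ok")
```

For instance, a service whose managed objects report {"BRDB": "ok", "NPS": "warning"} would be shown as "warning" under this rule.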

8.2 Service Control

[DN: Need to describe the various options, and how they tie in to the applications. A number of models
exist in Horizon; should we be standardising on a few patterns? More likely to summarise the models in
Horizon and make them available for patterns. A brief summary is presented here but it is not intended to
be exhaustive and needs more architectural steer]

The focus in Horizon was on the technology being used to raise the event. The design goal in HNG-X
is to focus on the managed object, and design the event filtering to ensure that SMC are
presented with the business impact rather than the underlying cloud of events.

8.2.1 Things not currently viewed as services

8.2.1.1 VIP/CSM
Services are monitored by the Cisco ACE which publishes VIPs and directs requests.

8.2.1.2 Interstage

Interstage will be monitored by JMX, which will report into the System Management systems. There are
two main systems on the BAL: the Interstage services and the Java on-line routing agents. If Interstage
were to fail, either completely or partly, then the whole server should be restarted.

8.2.1.3 Systems

A variety of methods are used in Horizon:
BMC Patrol (Oracle and Solaris and SAP)
Compaq Insight Manager
Fujitsu-Siemens ServerView
EMC Enterprise Control Centre, network device, and clusterware raising SNMP traps
Unix and Cisco syslogs

8.2.1.4 Applications

Applications raise events in a number of ways:
Directly to Windows event viewer or Unix syslog service
Indirectly to exception tables in the application database
Through an event to the Tivoli TEC or expedited TEC

POLFS (SAP) events are raised via BMC Patrol, which is an SAP approved mechanism.


8.2.2 Services which are completely autonomous

There should be none of these. Every service should at least raise events that let SMC report on the
state of the system.

In Horizon there were a number of such services, for example the BT paging system which was offered
as a print service on unix systems, and which sent pager alerts direct to the Unix Support Team. This
meant that migrating components to SMC management was complex, and was a hangover from early
Horizon days when Tivoli monitoring of the Data Centre was not mature.

8.2.3 Services which are autonomous but report events

In unix these are "respawn" services managed by init.d; in Windows these are referred to as "locally Tivoli
managed".

Examples are the Network Banking Authorisation agents. They may occasionally self-terminate, but an
event is generated to explain why, and they then auto-restart and start offering the
service automatically.
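
The "self-terminate, report why, auto-restart" pattern can be sketched as a small supervisor loop. This is illustrative only and is not how init or Tivoli implement it: the function name is hypothetical, and the max_restarts limit is an addition made here so that the example terminates.

```python
def supervise(run_service, report_event, max_restarts=3):
    """Run a service, report each termination, and restart it.

    run_service() blocks until the service self-terminates and returns the reason.
    report_event(text) stands in for raising an event to the monitoring layer.
    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        reason = run_service()
        report_event(f"service terminated: {reason}")
        if restarts >= max_restarts:
            report_event("restart limit reached; giving up")
            return restarts
        restarts += 1
```

The important property for SMC is that every termination produces an event before the restart, so the service is autonomous without being invisible.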

Interstage probably fits in here rather than above.

8.2.4 Services which are managed externally

There are several sub-classes of these.

The EACRR Horizon system is Tivoli managing the failover of a pool of agent services on a set of
physical agent platforms.

The Maestro and TWS schedulers also perform this function. These are not generally event driven, but a
class of scheduled jobs has developed (such as the rates file arriving) which trigger scheduled jobs.

There are also still some local unix cron or windows scheduled tasks. These should not be permitted, but
certain systems e.g. cBlade PAN Manager do not allow agents (Tivoli or TWS) to be installed.

Tivoli can interact directly with systems through ssh. For example commands may be issued to PAN
Manager as an LPAN Administrator.


9.1 Introduction

The migration strategy in ARC/MIG/STG/0001 foresees a protracted period where old and new Data
Centres are running in parallel as sets of services are migrated on successive weekends.
This is followed by a period of operation in the new Data Centre where the counter estate is
entirely Horizon, although the Data Centre is substantially HNG-X. A small number of
Horizon Data Centre services are migrated in order to support this until the last counter has
migrated to HNG-X. In addition the counter will have an NT4 to XP migration, which may occur
some time after the HNG-X application migration. Horizon Systems Management systems
(SYSMAN2) need to be retained until this period is completed as the HNG-X Estate
Management systems (SYSMAN3) cannot provide all the required services to NT4 platforms.

Migration and Transition are periods between the current Horizon system and the final end-point running
only on HNG-X.

Migration starts with Horizon running solely at Bootle and Wigan and ends with all services running from
Belfast. This period is expected to last for 6 to 8 weeks. The final state is a Horizon service running in
Belfast that is ready for HNG-X Pilot.

Transition is the period between the end of migration and there being no Counter using any Horizon
components. This includes the upgrade from NT4 to XP at the counter. Transition is complex and
protracted but is operating entirely outside the Data Centre apart from the demise of a few
services during the transition period. As such it does not generate extra states to be considered.

It should be noted that once PCI compliance is achieved the Hydra DCS agent will be decommissioned.

The Horizon PCI Counters do not rely on the Branch Database. They use HNG-X style
messaging (via the BAL and using NPS) for transaction authorisation, but report transaction outcomes via
Riposte as normal.

The BDB is not used to obfuscate PANs - that is done on the Counter. For the PCI Horizon Counters the
necessary seed is transmitted as a software distribution encrypted under the GDK. For HNG-X Counters
the seed is provided at log-on by the BAL.


The Pilot will use real outlets. Whilst the number will be small the fact that they are involved in a highly
visible Pilot programme will mean that they receive attention out of proportion to the impact of failures on
the overall business.

9.2 DR During Migration

It is a pre-requisite for migration that the Tivoli Workload Scheduler (TWS), the HNG-x Systems
Management environment and the backup system are functioning. It may be necessary to demonstrate
SYSMAN2, SYSMAN3 and TWS DR prior to starting migration, although these are merely part of
Production LPAN DR described in 10.1.1.3 which will have been formally tested during ITU testing.

The high-level DR architecture has been deliberately presented in a similar fashion to the Migration
Architecture, with blocks of services.

9.2.1 POLFS

SAP is a German company. SAP staff tend to use the term "Productive" where an English speaker would
use "Production" and the two terms should be regarded as synonymous in this context.

POLFS is migrated in "Weekend A", which is actually three separate sub-phases for Dev, QATest and
Production. There may be more than one elapsed week between phases. Also Weekend A is now at the
end after Weekend D.

All POLFS servers are Fujitsu-Siemens PrimePower. The Production Central R3 server, and its DR
counterpart, are PW1500 systems, and the remainder are PW450 with 4 CPU and 8GB RAM, except for
the IXOS servers which only have 2 CPU. XI servers (identified by a platform code of nws) have 16GB
RAM.

Data exists in the Athene LTPDB and in various SAP reports to show the usage of CPU and memory
during the batch processing and on-line windows, and is not of interest to DR provided similar systems
are available at the secondary site in a suitable state to become the DR target.

The database servers run Solaris9 and Oracle9iR2, but this is installed as part of the SAP install, and
DBA services are provided by the SAP Basis team not IRE11 DBA Team. Prism provide application
development and support to POL in Chesterfield.

A "lift & shift" approach has been adopted on cost grounds. This has a significant impact on the ability
to provide timely DR and regression path during the migration period.

Softek TDMF is providing the "Host Based volume replication" functionality.

Readers should not waste time commenting on the time taken to migrate, which is being discussed in a
separate forum specifically looking at POLFS migration. These sections are merely here to illustrate the
states for DR during migration.

9.2.1.1 Dev: Bootle -> IRE19 Weekend A-2

Dev consists of two servers, one running an SAP/R3 instance PLD and one running an XI instance DXI.
There are no separate Dialog servers and all DI instances run on the central system.

No requirement for DR of Dev has been stated in CS/OLA/049, and the Dev systems are not a DR target
for any other system, so there is no impact on DR of moving this service to IRE19. In fact in the event of
a real disaster the provisioning of replacement hardware and a restore from tape backup is the most
likely recovery. Provisioning hardware can take eight weeks.

In spite of the apparent insignificance of Dev, Fujitsu Services are aware of the potential impact on a set
of highly skilled and expensive users having the system unavailable, and also of the impact on any urgent
fixes.

9.2.1.2 QATest: Wigan -> IRE19 Weekend A-1

QATest consists of one server running SAP/R3 instances PLQ and PLE, three R3 Dialog servers, and
three servers running an XI instance QXI.

The R3 central instance server is the failover target for Production.
The XI central instance server is the failover target for Production.
The QATest R3 dialog servers can be switched to point to Production.
XI has two dialog servers which are failover targets for Production.

When the QATest system has been moved to IRE19 (assuming no swing kit) the DR path for Production
is from Bootle to IRE19.

Softek TDMF will be used to provide data replication from Bootle to Belfast, and a DR test of
POLFS from Bootle to Belfast is planned as part of the pre-migration activity.

9.2.1.3 Archive

Archive is based on the IXOS product and EMC Centera storage. This is a physically separate Centera
from the POL Audit solution.

There are two IXOS servers, one in Bootle which normally runs a DSP database and archives
Production, and one in Wigan which runs the DS database and is used for archive testing of QATest.

Both IXOS servers write to both Centera and the Centera contain identical images. If one is unavailable,
then the missing images are automatically "caught up" once it becomes available again, and there is a
manual process which may be invoked to force this.

It is therefore safe to move the Wigan system, test functionality, and then move the Bootle system. There is a
slight lowering of resilience during this period, but DR from Wigan tape to the IRE19 (migrated Test)
system can be accomplished readily even from tape as it is a very small database.

9.2.1.4 Production: Bootle -> IRE11 Weekend A

Production consists of one server running a SAP/R3 instance PLP which is around 3TB in size
Dialog servers, an XI instance PXI and three XI Dialog servers.

Failover is obviously a little unbalanced as regards Dialog servers, and by the time migration occurs the
details may differ slightly, but this has no significant impact on this document.

Softek TDMF is being used to provide "Host Based volume replication" functionality. This will allow the
data to be available at IRE11 fairly instantly, although there is still time required to transfer the server
itself.


The Production migration may be very protracted, up to 4.5 days in total, which even allowing for two
being at the weekend exceeds a 48-hour outage.

Once Production is in IRE11 the DR process is identical to that which operated in Horizon. It is intended
that this would be exercised in Weekend A+1, but depending on some of the swing kit and migration
proposals it could be exercised prior to migration. This is BC Test 28.1 in Horizon.

9.2.2 Batch Services - Weekend B

Batch services in Horizon operate on a single Solaris server with an active system in Bootle and a
standby system in Wigan. The rest of the Horizon estate is designed to allow the Batch server to fail over
independently of site DR (BC Test 3 - Database Server).

Batch services are migrating from Solaris9 and Oracle8 or Oracle9 to Solaris10 and Oracle10gR2. There
is a state test for migration that Horizon systems can address batch services via the Oracle10 listener,
but this is a migration issue not a DR issue.

All services except APOP and Maestro are being migrated on Weekend B, so although the Horizon
server will remain active, and will continue to be able to fail over Maestro and APOP services as normal,
the remaining services will move to IRE11.

The HNG-x batch services will be addressed via a VIP. As part of the migration all Horizon
systems that need to address TPS, APS, LFS, TES, DRS, DW, RDMC or RDDS will be updated to use
this VIP.

Horizon services and POLFS systems will not be aware (apart from a short outage) of the
Batch services failing over from IRE11 to IRE19, and this DR does not change significantly from BCT3
except for changes associated with Maestro.

ETMS services will migrate during Weekend B. Their primary function is to transfer data from DAT to the
customer.

During-this-migration the TES Query_ service may-also-move to Belfast, while the Horizon-TES-Query

ly-a DR problem: th red to address th " AP.

9.2.3 Online Services - Weekend C

Many of the Online service components in Horizon are stateless and run in active/standby with
automatic failover. This is strictly part of the resilience function not the DR function, but it does mean that
DR of these services (e.g. network banking authorisation agents) themselves is new at HNG-x.

Some services such as PAF and DVLA effectively run active/active across sites, with load balancing
providing both the resilience and DR mechanism in Horizon. At HNG-x these
will run active/active at the primary site with load balancing providing the resilience mechanism, but will
use BladeFrame failover for DR.

A number of services, notably NPS and APOP, have traditional Oracle database type DR, similar to the
batch services.

Itis-not-clear-whether {The MTAS service is-pat MT h_DCSM_and ,
the-Payment./-EMIS- stutf_for.DCS.-which-I-believe-i. Weekend -—However-there-is-an-at for
aff marily “baich’), DBIOCMS4

SESE DCMS secre: tielove oe & mack tera cease in fon (on Desh and be offered
from HNG-x (on EST)

With the exception of the DCS agent prior to PCI compliance, all HNG-x online services
components reside in the Production LPAN, and site DR is achieved as part of Production LPAN failover.

The Hydra DCS agent will migrate in Weekend D along with NRA and other Horizon Branch services.

During this phase there needs to be a trust between the HNG-x AD and Horizon domains.
Bootle/Wigan based systems}
The operation of services in IRE19 or IRE11 will be transparent to systems remaining in Bootle & Wigan.
As any service fails over the service address remains identical.

9.2.4 Branch Services - Weekend D

The HNG-x Branch Service only exists in IRE11 with a DR capability in the Production LPAN to
IRE19.

The Horizon branch services are migrated on Weekend D. Only a small number of systems, notably the
Riposte message store servers, Estate Management systems (ACDB and OCMS) and the Key
Management Server, have significant data associated with them. The remainder, such as Generic
Agents, NBS Routing Agents, VPN servers, boot server and RADIUS servers are stateless.

[DN: Need to check RADIUS servers - where do they get auth from]

After the Horizon Branch services have migrated they will be referred to collectively as "Hydra" to

Once this migration is complete the estate is at the migration end point, and no special consideration is
required post Weekend D.

Following the migration Horizon Branch services operate active/active from Belfast. This is a requirement
of the correspondence server resilience model. The counters are already set up to prefer a particular
correspondence server, and to try others in the same cluster at both sites in a round-robin fashion.
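
The counter behaviour described above (prefer one correspondence server, then try the rest of the cluster in round-robin order) can be sketched as follows. This is an illustration only: the server names, the cluster layout and the `is_up` check are hypothetical and are not taken from the Riposte configuration.

```python
from typing import Callable, Iterable, List, Optional


def connection_order(preferred: str, cluster: Iterable[str]) -> List[str]:
    """Return the order in which a counter would try servers.

    The preferred server comes first; the remaining cluster members follow
    in round-robin order, starting with the member after the preferred one.
    """
    members = list(cluster)
    if preferred not in members:
        raise ValueError(f"{preferred!r} is not a member of the cluster")
    start = members.index(preferred)
    return [members[(start + i) % len(members)] for i in range(len(members))]


def first_available(order: Iterable[str], is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first server in `order` that responds, or None if none do."""
    for server in order:
        if is_up(server):
            return server
    return None


# Hypothetical cluster: two correspondence servers at each Belfast site.
cluster = ["ire11-cs1", "ire11-cs2", "ire19-cs1", "ire19-cs2"]
order = connection_order("ire11-cs2", cluster)
print(order)

# Simulate the loss of IRE11: only the IRE19 servers respond.
print(first_available(order, lambda s: s.startswith("ire19")))
```

Because every counter holds the full cluster list for both sites, the loss of one site needs no reconfiguration at the counter; it simply moves on to the next server in its list.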

As each counter moves to HNG-x it will prefer IRE11, which is where the BAL is located. Similarly,
as each counter becomes PCI compliant it will prefer IRE11 for DCS and NB Auth traffic.

[DN: Need clarification on SYSMAN2 migration, esp. EACRR which is strongly coupled to Hydra]

9.2.5 Audit
The Audit service migration requires careful management and the use of swing kit for EMC Centera.

The Audit system is designed to be down for up to three days, although this is undesirable as it would
affect retrievals. The actual migration will be coupled to Weekend D to minimise traffic flows across the
WAN (a large amount of Riposte audit data is collected each night).

Audit will operate, certainly initially during Transition, as an active/active system and as such DR is not
any different to Horizon, and there is no special consideration required.


9.2.6 Network Management

The Network Management Systems, primarily HP OpenView and Cisco Works, plus some small
diagnostic tools, collect and forward events to SYSMAN and provide backup and recovery services for
switches. There will also be syslog servers for events forwarded from switches.

These systems all operate essentially active/active and there is no special DR consideration.

9.3 DR During Transition

During Transition, the Horizon components that are in use will use the same resilience and DR
procedures as before.

These are:
* Correspondence servers (these will be running on virtual machines)
* Generic Agents
* VPN servers
* Routing agents
* Boot server
* KMS

For all systems except KMS this will be covered by the active/active PAN, and no special consideration
is required.

KMS retains the current SRDF based database failover with some minor procedural modifications owing
to the fact that a BladeFrame based system cannot interact with Symmetrix, and also that the SYMCLI
version supported by NT4 is not supported with the generation of Symmetrix firmware in Belfast.

A component of SYSMAN2 known as EACRR (Enhanced Agent and Correspondence server Resilience
and Recovery) manages services on the Generic Agents. Consideration of how EACRR operates during
DR will be described in the EACRR HLD and is not of interest to this design, which is completed once the
failed over infrastructure is available to applications.


10 System Parts
This section details each service. It will cover:

1. A brief overview of the service and its availability requirement

3. An overview of the service criticality and business impact of failover.
4. The HLD reference for this service.

Some of this section will appear to duplicate material that has been presented earlier. This is intentional,
and is designed to make the description of each component easy to read.

10.1.1 Discrete Core Services

This section describes services which operate in an active/active or active/standby mode, but are
supported on hardware external to the BladeFrame.

[DN: Need to add HLD references to each section?]

10.1.1.1 Network

ARC/NET/ARC/0001 - Network Technical Architecture
DES/NET/HLD/0008 - LAN Design
DES/NET/HLD/0009 - WAN Design
DES/NET/HLD/0014 - Branch Access
DES/NET/HLD/0015 - Transit LAN

Brief piece about switches operating in pairs

Bit about WAN and C&W cloud

Brief piece about resilient firewalls.

Brief piece about how Test domains operate as sub-domains (VRF) and in effect Production is a sort of
sub-domain of a Management super-domain.

[DN: There may be some testing issues as V&I wish to exercise control over rig time, but the
Management domain and the V&I (Production) Domain have shared authentication and audit services.
Details in V&I HLTP TST/GEN/HTP/0002?]

10.1.1.1.1 DWDM Inter-site link

Ref: DES/NET/HLD/0004

The inter-site links are a shared service from Fujitsu Service Corporate Networks. A pair of Dense Wave
Division Multiplexors are provided at each site, designated North and South. Leased dark fibre from NTL
carries an FTel service to provide diversely routed links of approximately 28km and 48km. EMC have
measured a ping response time of < 3ms which provides considerably lower latency than the similar links
between Bootle and Wigan. Four 4Gbps Fibre Channel and four 1Gbps Ethernet links are provided for the
SAN and Network switches respectively.

The inter-site link for the Branch DMZ provides the resilient triangulation to protect against loss of the C&W
link to either IRE11 or IRE19.
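
As a sanity check on the quoted figures, the propagation delay over the two fibre routes can be estimated from their lengths. The sketch below is not part of the design: it assumes a typical refractive index of about 1.47 for optical fibre and ignores DWDM, switch and host processing delay, so it only gives a lower bound on the round-trip time.

```python
# Assumed propagation speed of light in optical fibre: roughly two thirds of c
# (refractive index ~1.47). This is a general physical approximation, not a
# value taken from the design documents.
SPEED_IN_FIBRE_KM_PER_MS = 299_792.458 / 1.47 / 1000.0  # km per millisecond


def round_trip_ms(route_km: float) -> float:
    """Propagation-only round-trip time over a fibre route of the given length."""
    return 2.0 * route_km / SPEED_IN_FIBRE_KM_PER_MS


for name, km in (("shorter route", 28.0), ("longer route", 48.0)):
    print(f"{name}: {km:.0f} km, about {round_trip_ms(km):.2f} ms round trip")
```

Under these assumptions both routes come out well below 1 ms for propagation alone, which is consistent with the measured ping response time of under 3 ms.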
10.1.1.1.2 Access Switch

The access switch provides a secure location to host VLANs for untrusted traffic. In order to
communicate with a core switch the traffic must exit the access switch and travel via a resilient pair of
external firewalls.

Deployed in pairs, one pair per site.
10.1.1.1.3 Core Switch

Deployed in pairs, one pair per site.

Switch 1 is preferred for traffic (cuts down ISL traffic to track state).

The inter-site ISL is resilient per pair of switches and is blocked on Switch 2 to prevent spanning-tree loops.
This means that if the inter-site link between the Switch 1 units fails, manual action is required to restore
full service between sites.

[DN: Is this still the case?]

In the latest design a second layer (confusingly called the access layer) of Cisco 3750 switches is used to
provide port scalability, with the 6513 offering aggregation and routing services, including FWSM and
ACE.

10.1.1.1.4 Cisco 2811 Router

This router is used to provide customer-edge (CE) and hand-off (HO) functionality. These routers are
deployed as resilient pairs.

The 2811 only has a single power cord. Banks of 2811 have been deployed in "header" and "footer"
comms cabinets, where each bank is connected to a different Data Centre PDU so that in the
event of a PDU or UPS failure only one bank will fail.
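
The placement rule above (each member of a resilient pair of single-cord devices sits in a different bank, fed from a different PDU) lends itself to a simple automated check. The following is a minimal sketch; the device names, bank labels and PDU labels are invented for illustration and do not come from the design documents.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory: device name -> (comms cabinet bank, PDU feeding it).
PLACEMENT: Dict[str, Tuple[str, str]] = {
    "ce-router-a": ("header", "pdu-1"),
    "ce-router-b": ("footer", "pdu-2"),
    "ho-router-a": ("header", "pdu-1"),
    "ho-router-b": ("footer", "pdu-2"),
}

# Resilient pairs of single-power-cord devices (also hypothetical).
PAIRS: List[Tuple[str, str]] = [
    ("ce-router-a", "ce-router-b"),
    ("ho-router-a", "ho-router-b"),
]


def pairs_sharing_pdu(placement, pairs):
    """Return the pairs whose two members are fed from the same PDU."""
    return [(a, b) for a, b in pairs if placement[a][1] == placement[b][1]]


def survivors(placement, failed_pdu):
    """Devices still powered after the given PDU (or its UPS) fails."""
    return sorted(d for d, (_, pdu) in placement.items() if pdu != failed_pdu)


print(pairs_sharing_pdu(PLACEMENT, PAIRS))  # an empty list means the rule holds
print(survivors(PLACEMENT, "pdu-1"))
```

With this layout a single PDU failure always leaves one member of each pair running, which is the property the header and footer banks are intended to guarantee.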

10.1.1.1.5 Cisco ASA5540

This "security appliance" is deployed in pairs to provide resilient firewall layers as required between
access and core switches, and to protect HO and CE routers.

The 5540 only has a single power cord. Banks of 5540 have been deployed in "header" and "footer"
comms cabinets, where each bank is connected to a different Data Centre PDU so that in the
event of a PDU or UPS failure only one bank will fail.

10.1.1.1.6 FWSM Firewall Service Module

The FWSM is a blade in the Cisco 6513 switch that allows VLANs to be separated by an enterprise class
firewall.

The FWSM resilience is provided by the corresponding FWSM in the paired switch at the same site.
10.1.1.1.7 ACE Load Balancer

The Cisco Application Control Engine is a load balancer (similar to the Content Switch Module) which is
provided as a blade in the Cisco 6513 switch.

The ACE works as a pair across sites, with local N+1 resilience being provided by the locally paired
switch.

The ACE watches for services being advertised, e.g. a particular server starting a service on port 80, and
advertises a preconfigured virtual IP address (VIP) for that service. Each application owner that requires
a VIP should describe how their service interacts with ACE in the network section of their HLD.

This is the standard method of providing virtual IP addresses in HNG-x.
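
The VIP behaviour described above can be modelled in a few lines. This is a conceptual sketch only, not the ACE configuration or its probe mechanism: the `VirtualService` class, the addresses and the round-robin selection policy are all assumptions made for illustration.

```python
import socket
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Endpoint = Tuple[str, int]  # (server address, TCP port)


def tcp_probe(endpoint: Endpoint, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


@dataclass
class VirtualService:
    """A preconfigured VIP and the real servers that may back it."""
    vip: str
    port: int
    real_servers: List[str]
    _next: int = field(default=0, repr=False)

    def healthy(self, probe: Callable[[Endpoint], bool]) -> List[str]:
        """Real servers currently offering the service on the configured port."""
        return [s for s in self.real_servers if probe((s, self.port))]

    def advertised(self, probe: Callable[[Endpoint], bool]) -> bool:
        """The VIP is only advertised while at least one real server is up."""
        return bool(self.healthy(probe))

    def pick(self, probe: Callable[[Endpoint], bool]) -> Optional[str]:
        """Choose a healthy real server in simple round-robin order."""
        up = self.healthy(probe)
        if not up:
            return None
        server = up[self._next % len(up)]
        self._next += 1
        return server


# Hypothetical service: three web servers behind one VIP, one of them down.
svc = VirtualService(vip="10.0.0.80", port=80, real_servers=["web1", "web2", "web3"])
up_now = {("web1", 80), ("web3", 80)}
fake_probe = lambda ep: ep in up_now  # stands in for tcp_probe in this demo

print(svc.advertised(fake_probe))
print([svc.pick(fake_probe) for _ in range(3)])
```

In a live environment `tcp_probe` would replace `fake_probe`. The point of the model is that clients only ever address the VIP, so a server failing or a service moving is invisible to them as long as one real server remains healthy.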
10.1.1.1.8 [NMN] OpenView

Ref: DES/NET/HLD/0012

HP OpenView on Sun V890.

Active/standby? How does failover work?

One per site, although there is also a Test set in IRE19 which could be redeployed in the event of loss of
IRE11.

Need to understand importance to service if there is a loss of this service (up to 1 day while repaired)
following a loss of IRE19.

10.1.1.1.9 [NCW] Cisco Works

Ref: DES/NET/HLD/0012

Hosted on Sun 280R.

Active/active.

One per site, although there is also a Test set in IRE19 which could be redeployed in the event of loss of
IRE11.

Toolset to permit management of Cisco switches.

Primary purpose during DR is to provide a means for support staff to manage the 18 subnets through
pre-scripted tasks.


This task can also be performed manually by a 3rd line Network Support person, but such manual
intervention is likely to be slower and prone to human error.

10.1.1.1.10 [NAP] AlarmPoint

Ref: DES/NET/HLD/0012
Discrete because a dial-out line is required.

AlarmPoint is the mechanism by which alerts are raised to pagers held by the second line support teams.
There will be two AlarmPoint servers operating in an active/active mode.

There should also be an LST AlarmPoint server to allow testing of the event raising mechanism.

This is not the primary means of notification, but serves as a way of alerting support staff to serious
events.

One per site, although there is also a Test set in IRE19 which could be redeployed in the event of loss of
IRE11.

[DN: Horizon had three BTI systems to keep N+1 in the event of Data Centre failure. Should a
third system be based in BRA01?]

10.1.1.1.11 [NPC] Network Packet Capture

System to capture packets for traffic analysis and diagnosis. These are deployed ad hoc to resolve
problems. Two are provided per site simply so that more than one problem may be analysed at the same
time.

In a switched network environment traditional sniffers have a hard time. Cisco allow switch ports to be set
to a diagnosis mode to capture traffic, and tools such as WireShark or Ethereal allow analysis of the
captured packets.

Most sensitive packets on the RMGA estate already have the data payload encrypted, and this tool really
only allows for analysis of flow and protocol problems, such as NFS mounts not succeeding.

10.1.1.1.12 [NTP] Network Time Protocol

Ref: DES/NET/HLD/0013
A pair of Galleon NTS6000 servers are deployed at each site at ntp stratum 0.

These are able to take a time source either from GPS satellites or from the MSF time signal. The clocks
can vote amongst themselves to provide an accurate signal, or indicate to a client whether they have a
poor time. The ntp client in Solaris, RedHat and the Cisco IOS is able to handle this gracefully.
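
The idea of clocks voting to agree a time and flagging a poor source can be illustrated with a simple median-based scheme. This is a generic technique chosen for illustration; it is not the Galleon algorithm, nor the selection algorithm used by NTP itself, and the clock names, readings and tolerance are hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple


def vote(readings: Dict[str, float], tolerance_s: float = 0.05) -> Tuple[float, List[str]]:
    """Return (agreed time, names of clocks flagged as having a poor time).

    The agreed time is the median of all readings; any clock further than
    `tolerance_s` seconds from that median is flagged.
    """
    agreed = median(readings.values())
    poor = sorted(name for name, t in readings.items() if abs(t - agreed) > tolerance_s)
    return agreed, poor


# Hypothetical readings in seconds; one clock has drifted badly.
readings = {"gps-a": 1000.001, "gps-b": 1000.002, "msf-a": 999.999, "msf-b": 1003.5}
agreed, poor = vote(readings)
print(agreed, poor)
```

A median is used rather than a mean so that a single badly wrong clock cannot drag the agreed time away from the majority.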

For Windows platforms the ACD platform will act as a stratum 1 time source.

10.1.1.1.13 [VPN] VPN Servers

Active/active.
12 per site allow each site to be independently N+1 resilient.

The VPN servers are really part of the network infrastructure, but rather than being hosted on appliances
they are physically separate NT4 servers.

These will be deployed on four RX300 servers [VSD Platforms] running Microsoft Virtual Server Host,
each hosting three NT4 VPN guests. They are discrete to allow connection to the Access switches.

The service is required for as long as there are NT4 counters in the estate, which may be long after the
Horizon application has disappeared.

Active/active. No failover.
Recovery by reprovisioning.
[DN CR0030 ie di the retention of VEN pastthe- Hydre phase]
10.1.1.1.14 [VPM] VPN Policy Manager

[DN: need some words]

Active/active. No failover.
Recovery by reprovisioning.
Deployed on a shared VSD with VDW and VEX.

10.1.1.1.15 [VDW] VPN Loopback Workstation

Active/active. No failover.
Recovery by reprovisioning.
Deployed on a shared VSD with VPM and VEX.

10.1.1.1.16 [VEX] VPN Exception Server

Active/active. No failover.
Recovery by reprovisioning.
Deployed on a shared VSD with VPM and VDW.

10.1.1.2 Storage Area Network

Ref: DES/NET/HLD/0007 SAN High Level Design
Ref: DEV/INF/ID/0003 SAN Configuration Document
Ref: DEV/INF/LLD/0043 Storage Mapping Document

10.1.1.2.1 [SAN] MDS9509 SAN Switch
Ref: DES/PPS/HLD/0007 Storage High Level Design

There will be a pair of MDS9509 switches at each site operating as a mirrored fabric. Both switches make
use of both inter-site links, North and South.


Multiple line cards, multi-pathing, alias based zoning.
VSANs will be used to segregate logical sets of data, primarily for performance reasons.

LUN masking and internal BladeFrame LPAN presentation will ensure that Test and Production disks are
only presented to the correct server instance.

10.1.1.2.2 [RSG & RGP] EMC Remote Support Gateway
Ref: DEV/INF/LLD/0030

The EMC Remote Support Gateways (RSG) are on discrete RX300 as they face the internet. They
provide a secure means of access for EMC engineers (using RSA SecurID tokens) as well as providing a
generic mechanism for alerts from EMC equipment to be sent direct to EMC. This allows a
response, e.g. to replace a failing disk, usually before any application is aware of the pending failure.

There are two gateways, one per site, operating in active/active mode. If support is required during a DR,
EMC must have a means of providing timely support.

The EMC Remote Support Policy Server (RSP) authenticates connections, but as the gateways have a
short memory of recent connections it is not a service that requires high availability.

Ref: DEV/INF/LLD/0029

EMC Enterprise Control Centre provides a centralised service for SAN and storage management. The
ECC Server itself supports a database and a data collection service, and will fail over in DR.

There are a number of data collection and control agents. To unload the ECC Server and provide a level
of performance scalability and resilience more than one ECC Agent server is usually deployed.

These services are not critical either to normal operation or to failover as other systems will alert upon
failure, e.g. OpenView will report events from SAN Switches, and the attached servers themselves will
report disk failures, but they allow a more effective diagnostic response, and they considerably simplify
the SAN and storage management tasks.

1 Storage Arrays
Ref: /DEV/PPS/HLD/0007
Ref: DEV/INF/LLD/0004

EMC DMX-3 storage arrays are used to provide storage service classes 1 and 2. There are two
pairs, A and B, one of which hosts BRDB and the other the Standby so that a storage array fault is
unlikely to compromise the ability to offer a Branch service.

EMC Clariion CX3-80 is used for storage classes 2 to 6. There is only one system per site, but for
these storage classes by definition the data is not mission critical and may be recovered from backup, or
by holding separate copies on the Clariion at each site.

EMC Centera CAS is used to store Audit data and POLFS Archive data. These are existing
solutions migrating from Horizon and are better covered in the sections on ARC and IXO platforms.

10.1.1.3 [PAN] BladeFrame PAN Manager
Ref: DES/PPS/HLD/0025 Bladeframe High Level Design


Ref: DEV/GEN/SPE/0007 Platform Hardware Instance List
Ref: DEV/INF/LLD/0066 BladeFrame Failover Low Level Design

The PAN Manager service permits resources to be allocated to logical groups known as LPANs.

Each PAN Manager service is hosted on a pair of control blades (cBlades) which provide a service
address for that frame, and also provide resilience for the PAN connections to LAN and SAN. These
Blades are managed as appliances and are replaced as units by the hardware support engineer.

There are four BladeFrame chassis at each site. Three of these at the primary site normally operate the
Production LPAN. Three equivalent chassis at the secondary site are available for the Production LPAN
to fail over to, and normally support a number of Test LPANs. The fourth frame at each site is
permanently assigned to Production use for active/active systems, and is expected to host Hydra
systems running in virtualised environments, for instance, as well as services such as SAS, AD and DNS.

The frames operate in pre-designated pairs, so services in bf002 will always fail over to bf001 etc. The
LPAN definitions on bf001 and bf002 are identical with the exception of VLAN IDs for VLAN tagging. It is
thus possible to deploy changes to the LPAN configuration at the secondary site before applying them to
the primary site.

In principle it is possible to operate Production and Test services concurrently, but in practice there are
unlikely to be enough pBlade resources to support this.

Certain basic design rules are described in the HLD to maintain optimum resilience, for example:
Cluster members should not share a power domain
Cluster members should boot from diverse EMC cabinets
Services such as BAL which provide N+1 resilience should boot from diverse EMC cabinets
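
Design rules of this kind can be checked mechanically against a platform inventory. The sketch below is a hypothetical illustration: the `Blade` record, its field names and the sample data are invented, and "diverse" is interpreted here as "not all members boot from the same cabinet", which is an assumption rather than the HLD's definition.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Blade:
    name: str
    power_domain: str   # hypothetical label for the chassis power domain
    boot_cabinet: str   # hypothetical label for the EMC cabinet holding the boot disk


def violations(group: str, kind: str, members: List[Blade]) -> List[str]:
    """Check one group against the rules; `kind` is "cluster" or "n_plus_1"."""
    problems = []
    if kind == "cluster":
        domains = [m.power_domain for m in members]
        if len(set(domains)) < len(domains):
            problems.append(f"{group}: cluster members share a power domain")
    # Both clusters and N+1 services should boot from diverse EMC cabinets.
    if len({m.boot_cabinet for m in members}) < 2:
        problems.append(f"{group}: all members boot from the same EMC cabinet")
    return problems


# Hypothetical inventory entries.
db_cluster = [Blade("db1", "pd-1", "cab-A"), Blade("db2", "pd-2", "cab-B")]
bal_pool = [Blade("bal1", "pd-1", "cab-A"), Blade("bal2", "pd-1", "cab-A")]

print(violations("database cluster", "cluster", db_cluster))
print(violations("BAL", "n_plus_1", bal_pool))
```

Running such a check whenever the LPAN configuration changes would catch a placement that quietly removes the resilience the rules are intended to preserve.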

Dual cBlade failure is equivalent to the loss of that chassis, although there are circumstances where the
PAN Manager service may be lost but the I/O virtualisation functionality continues. This need not
necessarily result in a DR. A cold reset of the oBlades takes about 20 minutes, and is the primary
recovery mechanism. This would happen in parallel with the escalation on loss of service.

The cBlades may be rebooted in turn, a process known as a "rolling reboot", in order to reset them e.g. in
the event of major storage layout changes or after applying a patch. A rolling reboot does not impact on
Server operation, and is an allowable operation during normal working hours.

10.1.1.4 [VSH][VSD] Virtual Server Host
Ref: /DES/PPS/HLD/0004

It is getting very difficult to source equipment that will run NT4SP6A. To work around this a dummy
platform with W2003 Enterprise Edition has been created to allow NT4SP6A instances to run under
Microsoft Virtual Server Host.

This also permits several NT4 services to be hosted by a single platform. The majority of the hosted
services are over five years old and have a relatively low memory and CPU requirement.

The Correspondence Servers have a fairly high I/O profile, but this still allows a smaller service such as a
domain controller to be co-hosted.

This platform is available both for BladeFrame hosted and discrete systems. The discrete is known as

10.1.1.5 [BSM, BSS, BSL, BSW, EDL] Backup
Ref: DES/SYM/HLD/0015
The backup servers are active/active and do not perform DR.
The Solaris backup servers have EMC SYMCLI and NAVICLI installed and perform the additional
function of storage management servers. Any scripts required to fail over storage will be run from these
servers.
The Net Backup Master Catalogue Service needs to fail over in order to permit restores. This would be
simplest if it was in the frame, but the current plan is to present an SRDF disk to the RHEL Media Server
which will also run the master service in active/standby mode.
[DN: CP0048 has been raised to move the master service into the frame on a separate platform
[BSM] as the current DR solution is not supported by Symantec]
10.1.1.6 [MSH] Hydra Maestro Master
Ref:
A small Solaris server is required to continue running maestro as the TWS will not support NT4 and the
old version of maestro does not run under linux.
This will be a SunFire V125 server which mounts a single SAN disk as /opmaestro with a standby server
in the secondary Data Centre. The same RL2 / RL3 approach that the existing Horizon batch solution
uses is an appropriate design for making a platform active.
10.1.1.7 [ENT] Hydra RSA SecurID Server
Ref:
This server provides two-factor authentication for Horizon systems which will not be able to join the AD
domain, and therefore will not be able to use the Vintella two-factor authentication.
There is a simple master/slave relationship with one server at each site, although later versions run as
peers.
This will be a continuation of the Horizon solution on Ultra10 and Solaris 2.6. Retired systems will be retained to
provide spare parts.
10.1.1.8 [EBS] Enterprise Boot Server
Ref: /DES/PPS/HLD/0024
The Enterprise Boot Server provides kickstart, jumpstart and PXE boot services during initial builds. It is
not required to be available except during builds, and the most straightforward model is to simply have
one at each site, active/active, and simply point builds at the preferred system.
[DN: bootp is not clear]
It is a manual build (it is the very base system in the provisioning solution), although in principle rebuilds
could be via TPM, but that probably would require maintaining two platform types in Dimensions.
The current design seems to be that each service (System Test, SV&I, RV, LST, V&I, Production) has its
own EBS.
The BladeFrame management interface needs to mount a share from the EBS to perform kickstart builds
for RedHat.
10.1.1.9 [NAS] Networked Storage

Ref: /DES/PPS/LLD/0004

EMC Celerra is used to present SAN storage as network shares. In the present configuration this
storage is hosted on the EMC Clarion CX3-80 and replicated between sites by MirrorView. On failover
the MirrorView failover occurs first, and the slave Celerra at the secondary site then takes over
presenting the shares.

As with all synchronously replicated storage this does not protect against corruption, and any data stored
here should be backed up as necessary. This is straightforwardly achieved either via Clarion SnapView
replicas presented to a backup server over the SAN, or for small repositories as a network backup.

The EFS shares a repository with EPM. There is one huge read-only repository shared amongst all EPM
instances (one per rig), plus a smaller rig-specific repository. These are delivered via DXC by the
Configuration Management Workstation as CM pass DPVBs to TPM for onward distribution.

The Branch Database nodes also use NAS to avoid the need for a clustered file system, e.g. to write
audit data. Any node is able to write data, and the Audit Gatherer service can simply be pointed at the
share rather than having to trouble a node for the data.


10.1.1.10 [CON] Aurora Console Tower

Ref: /DES/SYM/HLD/0020

Aurora is used to manage all (serial) consoles for equipment such as cBlades, Solaris servers and Cisco
switches in a secure and auditable manner without physical access to the Data
Centre being required.

There will be two Aurora systems at each site with interfaces on the management LAN, each managing
complementary equipment, e.g.

Aurora1 manages bf001/cb1
Aurora2 manages bf001/cb2
Aurora1 manages core2 network switch (from mgmt LAN on core1)
Aurora2 manages core1 network switch (from mgmt LAN on core2)
Aurora1 manages Aurora2 and vice versa

Aurora connectivity is typically only used during "dead server" type recoveries, and is not required to be
highly available, but is very useful for looking at the logs to see what went up the screen as the system
died.

In the event of major disruption preventing access, site access will be requested by the local Unix Support
team who will gain emergency access via the Aurora physical console port until general connectivity is
restored. This has never been required in Horizon.

There is no DR requirement for Aurora itself, but it is a critical component in Solaris DR to allow properly
managed reboots. Emergency re-patching will allow a continued service for a limited number of servers.

10.1.1.11 [DXI] Internet Data Exchange Proxy

Ref: /DES/NET/HLD/0017
Active/Active. One per site, Secure Appliance WebWasher 1150.

Permits safe transfer of data from the Internet to the HNG-x network and vice-versa, e.g. software
packages released from Dimensions being transferred to the TPM Repository.

10.1.1.12 [SPS] Supplier Access Server

Ref: /DES/SYM/HLD/0017
This is a solution to allow support by third parties, e.g. Fujitsu Siemens or Oracle.

A CP has been raised to move this from BladeFrame to discrete as it needs to sit in the internet facing Access layer of
the network.


10.1.1.13 [HSM] Hardware Security Module
Ref: /DES/SEC/HLD/0002

Atalla AS10150 appliances replace the PCI cards in the Network Banking Agents.

There will be three per site deployed active/active which caters for both site DR and resilience.

Three appliances will be deployed in IRE19 for Test services, which may reuse. The Atalla is primed
by a key disk and these differ for Test and Live services.

There will also be smaller AS8150 in BRA01 and LEW02 for key generation.

10.1.1.14 [VNS] Vulnerability Scanning Server
Ref: /DES/SEC/HLD/0008

Foundstone FS1000, one per site.

Each site is effectively an independent deployment.

There is not a very high availability requirement on these systems. As long as vulnerability scans are
performed in a reasonably timely manner that is sufficient, so recovery by replacing with a spare is
adequate.

The IRE19 set will be part of LST in normal circumstances, and be reconfigured in the event of losing

The definitions of scans will be backed up so that they may be recovered in the event of a spare being
provided.

[DN: Not clear how these are provisioned in the first place]

10.1.1.15 [DAT] Legacy Batch Server

Ref: [Platform HLD]
Ref: /DEV/INF/LLD/0065 Solaris Failover LLD


The DAT service is effectively hosted by a four node cluster of PrimePower PW650 servers, giving N+1
resilience at each site. The same model is used as for the POLFS XI Server as described in
EA/DES/001. The data is shared via SAN storage, which also enables site failover.

Cisco ACE detects which of the four systems has started an Oracle listener and advertises a service VIP
on behalf of that server. Only a server with access to the disks will be able to advertise.

The server hosts a number of services including eight Oracle databases and the master TWS scheduling
service. These are discussed individually in the following sections.

All four servers need to talk to ntp and ssh on the SAS for management support, possibly also NBU client
and TCA.

Audit gathering is from each application not from the underlying platform.

Not sure how DNS is configured for this as an overall service. In effect we have a four node cluster
offering a single virtual service.

Because of the use of ACE there is the possibility that if the storage is "split" rather than failed over, such
as may be done during a major migration to preserve a regression image, that both sites will offer a
service. Procedures for splitting the Data Centre should ensure that the listener is inhibited.

For this reason also it is recommended that the default run level be made 2, and the service start up be a
manual event.
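
The risk described above can be illustrated with a small model. This is a sketch under stated assumptions, not a description of the Cisco ACE configuration: it assumes only that a node is advertised when it both has access to the disks and has a running Oracle listener, as stated earlier in this section. The node and site names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str               # hypothetical node name
    site: str               # hypothetical site label
    has_disk_access: bool   # storage is presented to this node
    listener_running: bool  # Oracle listener has been started


def advertising_nodes(nodes):
    """Nodes on whose behalf a service VIP would be advertised."""
    return [n for n in nodes if n.has_disk_access and n.listener_running]


def sites_offering_service(nodes):
    """Sorted list of sites that currently offer the service."""
    return sorted({n.site for n in advertising_nodes(nodes)})


def is_split_service(nodes):
    """True if more than one site would offer the service at once."""
    return len(sites_offering_service(nodes)) > 1
```

With storage split, both sites have disk access, so the only remaining safeguard in this model is that the listener is not started at the second site. That is why a default run level of 2 with a manual service start is recommended.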

[DN: This has an impact in situations where the server reboots, e.g. loss of a processor]
[DN: Will there be a HNG-x Dimensions reference for the POLFS resilience design?]

10.1.1.15.1 TWS Tivoli Workload Scheduler

Ref: /DES/SYM/HLD/0016

The master Tivoli Workload Scheduler will be hosted on DAT as at Horizon. This has two advantages;
firstly this is a resilient platform, and secondly most of the scheduled jobs actually run on DAT anyway,
and much of the ancillary scripting from Horizon can be simply redeployed.

10.1.1.15.2 DRS

Ref: /DES/APP/HLD/0033
Data Reconciliation Service. Actually two services in one system, NBS and DCS.

Banking and Debit Card have different settlement periods, Debit Card being next working day, and
Banking being same day.

Collects C12's

Produces CAPO and A&L REC files

Receives LREC file from Link

Produces Payment file for Streamline, and receives EMIS file.

Produces banking reports

10.1.1.15.3 TES

Ref: /DES/APP/HLD/0036
Transaction Enquiry Service
Supports queries from POL users in Huthwaite via TESQA.


Also provides C4/D feed to DRS that used to come from IBM NBE.

10.1.1.15.4 TPS

Ref: /DES/APP/HLD/0027
Transaction Processing Service.

Harvests EPOSS transactions.

Provides feeds for POLFS (BLE) direct via NFS

Provides POL-MIS (W_jjinnn.TP_.pzR) via TIP FTMS Gateway

Provides NBS feed to DW via local file system.

Provides C112 data to DRS for reconciliation

Provides Client Transaction Summary (CTS) file to POL (W_jji500.TP_.pz)

10.1.1.15.5 APS

Ref: /DES/APP/HLD/0026
Automated Payment Service

Harvests APS transactions.

Provides files to clients via EDG and a client transaction summary (CTO) to POL-MIS.

10.1.1.15.6 LFS

Ref: /DES/APP/HLD/0037
Logistical Feeder Service
This function could probably have been incorporated into the BranchDB at HNG-x (but it hasn't been).

Acts as the interface between SAP/ADS and the Branch estate for planned orders, cash on
hand, pouch transfers etc.

All communication with ADS is via TIP FTMS Gateway.

10.1.1.15.7 DWH

Ref: /DES/APP/HLD/0082
Data Warehouse.

Used to be a big SLA calculator. This is the only system in the estate truly designed to run
asynchronously, and can catch up independently provided the input files have not been lost. Used to be
on a separate platform but was consolidated at S50.

At Horizon it is used to calculate some message delivery SLAs, but these really disappear at HNG-x
(not at Hydra).

Produces some banking reports that could probably come from DRS. I think these are likely to be
removed.

Some "extra" functionality at HNG-x but details not yet clear.


10.1.1.15.8 RDMC

Ref: /DES/APP/HLD/0004
Ref: /DES/APP/IFS/0004 HNG-x Ref Data Delivery

Reference Data Management Centre.
Receives updates from POLIRDS. These are tested on the RDT systems and then marked for release
using the RDMC Workstation.

RDMC is also used for memos to PMs and for “urgent” ref data such as Bureau de Change rates which
bypass the RDT checks.

10.1.1.15.9 RDDS

Ref: /DES/APP/HLD/0005
Ref: /DES/APP/IFS/0005 (HNG-x Counters)
Ref: /DES/APP/IFS/0001 (BranchDB)
Ref: /DES/APP/IFS/0002 (BranchDB)
Ref: /DES/APP/IFS/0003 (SYSMAN)

Reference Data Distribution Service

Copies released data from RDMC and transforms it to be suitable for all client systems. Many clients read
the data via ODBC database links, DW gets input files, and a Loader is run for Riposte.

Branch Database will probably use ODBC.

10.1.1.16 SYSMAN2

SYSMAN2 is the generic term for the Horizon version of Tivoli estate management. NT4SP6A will not run
under the latest version of Tivoli, so SYSMAN2 must be retained for as long as there are NT4SP6A
platforms requiring Tivoli management.

The resilience model is unchanged from Horizon.

It is expected that the SYSMAN2 Primary site will be IRE19 to continue the model established in Horizon
that in the event of failure of the primary site (IRE11) SYSMAN2 is already available to manage the Hydra
services at the secondary site.

Events will be collected direct to SYSMAN3 from the counters to give a single management view of
the branch estate. Retiring servers managed by EACRR will direct events to SYSMAN2.

[DN: Not clear whether this will happen]

10.1.1.16.1 [OMD] Inventory Server
Active/standby Oracle database server.

The OMDB database is critical to the management of the Horizon estate.


This is an SRDF based Oracle database, failover similar to DAT and KMS. It is discrete to permit
continued use of BCV backup.

[DN: It is not yet clear whether all of the services listed below are required in Belfast]

10.1.1.16.2 EACRR Enhanced Agent & Correspondence Server Resilience & Recovery

Ref: DES/SYM/HLD/0007

This is a service effectively hosted on OMD which filters and interprets events to provide resilience in the
agent services hosted on the AGE platforms.

Some of the services which EACRR detects or manages are moving to the SYSMAN3 domain.

Counter events from Hydra counters will go direct to SYSMAN3, but the data centre systems will continue

10.1.1.16.3 [SDC] Domain Controller

Formerly the OMDB Archive server this platform now provides only the Domain Controller
function for SYSMAN2.
Primary in IRE19, Backup in IRE11, standard NT4 model.

10.1.1.16.4 [SDS] Delivery Server

Not clear whether this interacts with the NAS Repository or whether two delivery streams continue, one
for Horizon and one for HNG-x. Retiring platform.

10.1.1.16.5 [SMR] Master TMR

10.1.1.16.6 [SMT] Master TEC

10.1.1.16.7 [SCT] Client TEC

10.1.1.16.8 [SEC] Expedited TEC

Required as long as EACRR is needed.

10.1.1.16.9 [SST] S-TEC
10.1.1.16.10 [LGW] Login Gateway
10.1.1.16.11 [SPGW] Post Office Gateway

10.1.1.16.12 [SSG] Secure Post Office Gateway
Boot loader DMZ


10.1.1.16.13 [TGW] Campus Gateway

10.1.2 Services in Active LPAN
Ref: ARC/PPS/ARC/0001

These are services which require an active/active DR model, or which use master/slave replication, and
particularly those services which are required to enable DR of the Production LPAN.

Modelling of these services in Test may be problematical, and some e.g. any providing storage
management or network security functions would not be modelled in Test.

10.1.2.1 [ARC] Audit Server
Ref: /DES/APP/HLD/0030 (Gathering)
Ref: /DES/APP/HLD/0029 (Retrieval)

SQL-Server on Windows 2003.
Horizon (and therefore Hydra) has separate gathering at the primary and secondary site to distinct EMC
Centera CAS Arrays, with a separate index database at each site on the audit server.

Until all Horizon data has expired, a period of seven years after the final counter migration to HNG-x
plus the time to resolve any outstanding court cases, there is not really an opportunity to
redesign the service.

10.1.2.2 [SPN] Metron Athene
Ref: /DES/PER/HLD/0022
Performance & Capacity management reporting.


The database is stored on Class 2 storage, and is replicated by the application. Loss of this service would
impact on the ability to provide monthly reports.

In general gaps in the performance history are not considered important.

10.1.2.3 Support Services

10.1.2.3.1 [DXC] Corporate Data Exchange Proxy

Ref: /DES/NET/HLD/0018

Active/Active, one per site.

Permits safe transfer of data from corporate networks to the HNG-x network, e.g. software packages
released from Dimensions being transferred to the TPM Repository.

10.1.2.3.2 [SAS] Horizon Secure Access Server

Provides a secure point of entry into the estate for the support of Horizon systems.

Two in each Data Centre Support DMZ in active/active state for core support staff.
Resilience model is to use one of the other servers.

Users logged into SAS are authenticated with their appropriate role against PWYDCS which is trusted by
Hydra systems in BOPSS and WOPSS.

[DN: I am not really sure why we still need this]

Recovery is by re-provisioning.

10.1.2.3.3 [SSN] HNG-x Secure Access Server

Ref: /DES/SYM/HLD/0017

Support Access Server. Provides secure point of entry into the estate for support staff.
Two in each Data Centre Support DMZ in active/active state for core support staff.
in each POL DMZ in active/active state for SAP Basis support staff.

Resilience model is to use one of the other servers.

Recovery is by re-provisioning.

10.1.2.3.4 [DNP] & [DNS] BIND

Ref: DES/NET/HLD/0006

Traditional DNS implementation with primary + secondary in IRE11 and IRE19.

Primary active/standby.
Secondary active/active.

Lookup services will be available during failover, but update services will not be available until the primary
is failed over.

10.1.2.3.5 [ACD] Active Directory/DNS

Ref: DES/PPS/HLD/0003

This platform also provides the two-factor authentication service as described in DES/SEC/HLD/0001.
It may also provide the Certificate Authority Service [CAN].

FSMO Roles are only active on one server at a time, so during DR basic authentication is available but
updates will not be until the FSMO Roles have been transferred.

The interaction between Hydra NT4 Domains and HNG-x AD Domain is not yet clear. Most of the
data transfer is by agent services logging in as Oracle database users which does not require any form of
trust.

10.1.2.3.6 [NRS] RADIUS servers
Ref: /DES/PPS/HLD/0011

Ref: /DES/NET/HLD/0014
Ref: DEV/INF/LLD/0077 (CP0184)
RADIATOR RADIUS Service.

CP0184 changes the layout from each service having its own platform to a single platform operating as a
load balanced pair at each site, running services BR-ADSL, BR-WWAN, BR-ISDNIN, BR-ISDNOUT.

ACE RADIUS probes will be configured to test the authentication of the RADIUS servers within the server
farm and control whether a particular RADIUS instance is functioning and hence be made available within
the server farm. In the event that no servers are available for a given instance, the VIP will not be
advertised through routing on the MSFC and the corresponding data centre VIP will service requests.
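
The probe-driven behaviour can be sketched as follows. This is an illustrative model only, assuming probe results are available as a mapping of instance name to per-server pass/fail values; it does not represent ACE or MSFC configuration syntax, and the server names in the usage are hypothetical.

```python
def available_servers(probes):
    """Servers whose authentication probe succeeded (name -> bool mapping)."""
    return sorted(name for name, ok in probes.items() if ok)


def vip_plan(local_probes, remote_probes):
    """Decide, per RADIUS instance, which site's VIP services requests.

    'local' means the local VIP is advertised; 'remote' means it is
    withdrawn and the other data centre's VIP takes over; 'unavailable'
    means no server passed its probe at either site.
    """
    plan = {}
    for instance, probes in local_probes.items():
        if available_servers(probes):
            plan[instance] = "local"
        elif available_servers(remote_probes.get(instance, {})):
            plan[instance] = "remote"
        else:
            plan[instance] = "unavailable"
    return plan
```

Because the decision is made per instance, a failure of BR-ADSL on both local servers would move only that instance to the other site, while the remaining instances continue to be served locally.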

10.1.2.3.7 [NRM] TACACS
Ref: DEV/INF/LLD/0077

CP0184 introduces a resilient authentication service for network management. This is based on
two platforms per site, N+1 at each site, on a similar model to DNS or AD.

10.1.2.3.8 [IPS] Intrusion Protection System Management Server

Ref:

Two IntruShield 3000 probes are deployed at each data centre for Production use. There is a
Management platform at each site based in the active LPAN, either one of which may manage the
service in a similar model to firewall management.

10.1.2.3.9 [NFM] Firewall Manager
Hosted on BladeFrame
The firewall manager service is only required in order to update firewall rules. This is a relatively
occasional requirement, and should certainly not be required during a DR.
10.1.2.3.10 [SYS] Syslog Server
Ref: /DES/NET/HLD/0012
Active/active. All systems write to both syslog servers. The syslog servers will forward interesting events
to Tivoli via NetCool probes, but it is likely that any events will have already generated an SNMP trap via
OpenView.
The Branch Routers will also report to the syslog servers, so these are fairly high performance and run a
dedicated syslog daemon in addition to the standard linux one that manages the platform.
Interesting events have already been sent to SYSMAN3 by the time the server fails. Opposite site system
continues. In the event of prolonged site failure a second system would be provisioned.
10.1.2.4 Branch Access - Hydra

10.1.2.4.1 [KMS] Hydra Key Management Server

Ref: /RS/MAN/013

This is a SQL-Server database running on Windows NT4 in a special security domain.

The database and any required file store are on the Key Management Server S: drive which is on EMC
SRDF replicated storage. BCVs are not used, SQL-Server does a dump to disk and areas of the S: drive
are backed up. There is a security concern over the possible theft of this database, and as a result cold
backup images are rarely made.

One of the most important uses for these keys is to encrypt the counter message store. The complexity of
doing this and the need to recover transactions from “dead” counters that may not have replicated to the
correspondence servers has been one of the big drivers for HNG-x.

The KMS has a hardware random number generator, which mandates that it be outside the BladeFrame
unless a software equivalent is approved (the company that makes the hardware RNG has stopped
because it believes the software one is better, but the approval process is complex).

It is doubtful whether EMC Solutions Enabler (SYMCLI) will work inside a VSH platform, and in any case
the version of SYMCLI that was compatible with NT4 is not compatible with current EMC microcode
versions on the DMX. The Booter service will probably need to be worked around in order to allow
service startup, and the failover procedure will need to be modified to interact with the Storage
Management Server (BSS, the Solaris Backup Server).

Resilience: fail over to secondary.

10.1.2.4.2 [COR] Correspondence Server

Ref: DES/PER/HLD/0003 Branch Trading Resilience HLD.
Ref: [DN: Need a ref for redesigned EACRR]

Ref: SY/SPG/002

Active/active with local N+1 resilience. No failover.

Correspondence Servers host the Riposte Message Store distributed database, of which each counter is
also a member.
The total message store has been manually split into four clusters, each of which contains approximately
one quarter of the branch estate (both in terms of size and performance). The split is actually nearer 30%,
30%, 20%, 20%.
Each message store is supported by four servers, known as neighbours, two at each site. In Horizon one
server at each site uses EMC storage to allow BCV backups, and the other is on Compaq RAID array in
case of EMC failure. In HNG-x one server will use DMX-A and the other will use DMX-B or the
Clarion.
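
The neighbour layout described above lends itself to a simple consistency check. The sketch below is illustrative only: the tuple layout and the server, site and storage labels are hypothetical, and it assumes the intent is four neighbours per cluster, two at each site, with the two servers at a site on different storage.

```python
from collections import Counter


def check_cluster(neighbours):
    """Check one message store cluster.

    `neighbours` is a list of (server, site, storage) tuples.
    Returns a list of problems; an empty list means the layout is as expected.
    """
    problems = []
    if len(neighbours) != 4:
        problems.append(f"expected 4 neighbours, found {len(neighbours)}")
    per_site = Counter(site for _, site, _ in neighbours)
    for site in sorted(per_site):
        if per_site[site] != 2:
            problems.append(
                f"site {site} has {per_site[site]} neighbours, expected 2")
        storage = [st for _, s, st in neighbours if s == site]
        if len(set(storage)) < len(storage):
            problems.append(f"site {site} neighbours share storage")
    return problems
```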
This is analogous to Branch Database resilience, except that Riposte has all four members active,
whereas Branch Database is active/standby at the primary site with a failover delay to the secondary site.
This reflects the higher Horizon availability requirement for PAS/CMS which is now defunct.
The initial build will be from a .VHD file captured from the Horizon system. A rebuild will start by re-
applying the .VHD followed by re-provisioning of any fixes.
If a rebuild is required there is a procedure for recovering the message store either from a backup or by
replication from a surviving neighbour.
10.1.2.4.3 [AGE] Generic Agent
Ref: [DN: Need a ref for redesigned EACRR]
Ref: DES/PER/HLD/0003 Branch Trading Resilience HLD.
The generic agent servers (which are generally known as agents) run services (confusingly also known
as agents) which allow messages to be passed between Riposte and the back-end databases.
There are also streams running in the daytime maestro batch schedule, such as pouch delivery, which
use the bulk load agents to turn LFS into a sort of online system with high latency, so it is not simple at
Horizon to look at a stream and say whether it is batch or on-line.
EACRR is used to track the agents in the pool, and make sure that one (and only one) of each type is
running. In practice most back end systems are able to cope with multiple agents, as the agent recovery
is typically a reharvest which generates duplicate input anyway.
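The "one (and only one) of each type" rule can be illustrated with a minimal sketch. This is not the EACRR implementation; the function name, the agent type names and the host names are hypothetical, and the sketch only shows the reconciliation idea of starting missing agent types and stopping surplus instances.

```python
from collections import defaultdict


def reconcile(required_types, running):
    """Decide what to start and stop so each agent type runs exactly once.

    required_types: list of agent type names (hypothetical labels).
    running: list of (agent_type, host) pairs currently observed in the pool.
    Returns (to_start, to_stop).
    """
    by_type = defaultdict(list)
    for agent_type, host in running:
        by_type[agent_type].append(host)

    # A type with no running instance must be started somewhere in the pool.
    to_start = [t for t in required_types if not by_type[t]]

    # Keep the first instance (by host name) of each type and stop the rest.
    to_stop = [(t, h) for t in required_types for h in sorted(by_type[t])[1:]]
    return to_start, to_stop


# Hypothetical pool state: two harvesters running, no bulk load agent.
to_start, to_stop = reconcile(
    ["harvester", "bulk_load"],
    [("harvester", "agent01"), ("harvester", "agent02")],
)
```

As the text notes, a brief duplicate is usually tolerable because agent recovery is typically a reharvest that generates duplicate input anyway, so a sketch like this can afford to start before it stops.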
Stateless. No complex failover just restart.
Recovery by reprovisioning.
[DN: General note for Hydra systems - still need a reference for the provisioning process i.e. rebuild VSH,
drop in original .VHD files and then apply HNG-x (Hydra) fixes]
10.1.2.4.4 [NRA] Network Banking Routing Agent

Ref: NB/HLD/017

The NBX Routing Agent listens for [R1] and [CO] messages through a Riposte real-time message port,
adds a time-stamp and routes each message to the appropriate NBX Authorisation Agent. There is one
Routing Agent instance for each Correspondence Server Cluster. The use of a Riposte real-time
message port ensures that the Agent will only process 'fresh' messages.


Note that each physical NRA platform runs four instances of the Routing Agents. These run as
active/standby pairs exchanging heartbeats via Riposte.

Resilience is unchanged from Horizon.

10.1.2.4.5 [ACF] Hydra ACDB

Ref: /DES/SYM/IFS/0001 (Connect DSL Interface)
Ref: TD/DES/150 (ACF Replay Design)

Hydra Auto-config database.

ACF Replay is the process of re-issuing all ACF to all counters, e.g. if there is a key compromise. The
Hydra system (ACDB, ACS and SYSMAN2) needs to be capable of an ACF Replay until all counters are
migrated to HNG-x.

[DN: May change to "KMS" model]

10.1.2.4.6 [OCM] Hydra OCMS

Hydra Outlet Change Management System.

10.1.2.4.7 [BLS] Horizon Boot Loader

This provides an initial point of contact for a replacement counter to make contact with the
Data Centre and download its identity (the Auto Config File).

Although this does not sound like it has a very high availability requirement, because of the sheer number
of counters in the estate the swap-out rate is relatively high, and the engineers have an SLA to replace
the counter within 20 minutes of arriving at the branch.

Delays due to BLS being unavailable have a detrimental knock-on effect in scheduling of engineers for
servicing other branches.

10.1.2.4.8 [BOO] VSAT Boot Server

Boot Server acts as a domain controller for the Boot Loader. It also provides a boot loader function for
satellite connected branches.

[VPM] VPN Policy Manager

10.4.2.3.8 [VEX] VPN Exception Server
Active/active. No failover.
Recovery by reprovisioning.
10.1.2.4.9 [DOM] Domain Controllers
Ref: [DN: Need a ref for redesigned Hydra NT Domains]

The Hydra security domain is relatively complex. It is made up of the traditional NT4 PDC/BDC pairs, in
some cases with a second BDC for extra resilience.

Many of the servers in the DMZ were their own PDC, which has caused some confusion when selecting
those that need migrating.

The design for Hydra Security Domains is still not clear, but as a minimum it is expected that the following
will be required:

BOPSS - Bootle servers
WOPSS - Wigan servers

PWYKMS - KMS servers and admin users
PWYSAS - SAS Administrators

PWYDCS - Support Users

Needs a security designer to comment and also to take ownership of the Hydra NT Domain LD.
Note: technically SDC is a Hydra Domain Controller, but it is managed by a different team.


10.1.3 Services in Production LPAN

There will actually be three Production LPANs, one on each of the three primary site frames, plus
additional LPANs on each active/active frame.

Each LPAN will be allocated the necessary resources for the pServers that are contained within that
LPAN. It is possible to operate several LPANs on a single frame, e.g. to limit each LPAN to a particular
type of resource and prevent LPAN administrators from accidentally assigning an inappropriate resource.

These LPANs will be created on both the primary site PAN and the secondary site PAN, and in the event
of DR, following the network and storage failover, plus reassignment of any pServers from Test LPANs,
the secondary Production LPANs may be started.


10.1.3.1 Support Services

These are support services which are not required to implement DR and can support a Production
failover model.

10.1.3.1.1 [SSC] SSC Support Server
Ref:

Runs in parallel with the Audit system collecting evidence for problem analysis by SSC. Typically users
will only report a problem sometime after an end of month report, so the evidence that led to the query
needs to reside in the estate for several months, and the main processing systems do not have space to
store data for this long.

This system also gets round any access or control issues of allowing SSC to gather certain evidence or
make audit queries to retrieve the same data.

In Horizon this operated active/active and the two systems were lazy mirrored using Robocopy. At
HNG-x this system must fail over and data replication is provided by MirrorView, so a backup must be
taken for corruption recovery.

10.1.3.2 Systems Management
Ref: /ARC/SYM/ARC/0001
Ref: /ARC/SYM/ARC/0002
Ref: /ARC/SYM/ARC/0003

Ref: /DES/SYM/HLD/0034 - SYSMAN3 Backup, Availability & Disaster Recovery Design

All the systems management services reside in the primary Production BladeFrames. They may be
deliberately spread amongst frames to distribute load or reduce sensitivity to power module failure.

Oracle Enterprise Manager is raising Oracle events to replace BMC Patrol
functionality, and is also hosting the RMAN Catalog Service. It will be hosted on EDS.


10.1.3.2.1 [EDS] Enterprise Database Server
Oracle database on W2003.

Standard Oracle database failover model. Database should perform automatic crash recovery after
failover.

This provides the repository for Tivoli Provisioning Manager and Tivoli Configuration Manager.

10.1.3.2.2 [EPM] Enterprise Provisioning Server

Software distribution management. Also provides the Provisioning functions, Branch Router management
and Tasks for Campus support.

10.1.3.2.3 [EFS] Enterprise Fan-out Server

The EFS will provide the Event Concentrator. It also allows performance scaling for distribution to a huge
estate. 40 EFS each handle up to approximately 500 counters.
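The fan-out figures above imply a simple capacity calculation, sketched here. The constants are the approximate values quoted in the text; the function names are illustrative only and the calculation ignores any headroom the real design may reserve.

```python
EFS_COUNT = 40           # number of EFS platforms quoted in the text
COUNTERS_PER_EFS = 500   # approximate counters handled by each EFS


def estate_capacity(efs_count=EFS_COUNT, per_efs=COUNTERS_PER_EFS):
    """Approximate number of counters the fan-out layer can serve."""
    return efs_count * per_efs


def efs_needed(counters, per_efs=COUNTERS_PER_EFS):
    """Smallest number of EFS platforms needed for a given counter count."""
    return -(-counters // per_efs)  # ceiling division without floats


total = estate_capacity()  # 40 * 500 counters
```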

10.1.3.2.4 [EMD] Enterprise Monitoring Display

Top level monitoring server, forming the “Aggregation Layer” of the Event Management Environment.

All events forwarded from clients across the Campus and Branch Estate, subsequently processed and
forwarded from the EES Collection Layer servers, are made available here for view and action by the
SMC (and/or Automation).


Configured with Tivoli Netcool/OMNIbus (NCO), all Events are processed and stored within its proprietary
Object Server Database. This is an in-memory Sybase Database, a regular dump of which is taken via
internal Netcool automation processes.

Events are written out to the ERP, to the relevant Oracle Database for access with the relevant Reporter
Toolset.

10.1.3.2.5 [EUI] Enterprise User Interface

This provides a portal service for real time views of monitored status of configured Servers.

10.1.3.2.6 [EMM] Enterprise Monitoring Server

A component of IBM Tivoli Monitoring (ITM), the EMM fulfils the role of Tivoli Enterprise Monitoring
Server (TEMS). Campus distributed ITM Agents route status and “situation” alerts into this server. This
information is stored in a proprietary database called the EIB. From here, this information can be stored
for Historical viewing by Tivoli DataWarehouse which is housed within the EDS. Data is also passed to
the EUI Servers for RealTime view of Alerts by support groups.

10.1.3.2.7 [EMS] Enterprise Management Server

Top Level Management Server, similar in function to (and including some of the same components as) the
MASTERTMR within SYSMAN2.

Two servers are required, one active and one standby in a separate BladeFrame, to provide N+1 resilience.

A Secondary instance of the Netcool Security Manager Database is required on the Secondary server,
and this instance is automatically synchronised with the Primary instance.

10.1.3.2.8 [EES] Enterprise Event Server

Second level servers, forming the "Collection Layer" of the Event Management Environment. All events
from clients across the Campus and Branch Estate (Server and Counter Log Messages) are processed
at these servers, after forwarding through the EFS NetCool Proxies.

Configured with Tivoli Netcool/OMNIbus (NCO), all Events are processed and stored within its proprietary
Object Server Database. This is an in-memory Sybase Database, a regular dump of which is taken via
internal Netcool automation processes.

Audit level events are written out to the EDS, to the relevant Oracle Database. Action level events to be
viewed at the Monitoring level are forwarded on to the "Aggregation Layer" Netcool/OMNIbus Object
Server, the EMD.

10.1.3.2.9 [EAS] Enterprise Availability Server

Two EAS Servers will provide a Business Systems View of Events routed from the "Aggregation Layer"
of the Event Monitoring structure (EMD) and presented for view with Tivoli NetCool Realtime Active
Dashboards (RAD).


10.1.3.2.10 [ERP] Enterprise Reporting Platform

Event analysis and statistics. Secondary Oracle Database Server, similar but smaller than the EDS.

Netcool/Reporter is installed and used for customised in-house reporting from the relevant Database
schemas.

10.1.3.3 Estate Management
Ref: /ARC/SYM/ARC/0005

Ref: /DES/SYM/HLD/0030
[DN: Design not yet complete]

10.1.3.3.1 [EST] Estate Management Server
[DN: What happened to BCDB ???]
Ref: DES/SYM/HLD/0039
The Estate Management System supports two databases:

EMBD the Estate Management Database is responsible for tracking the opening and closing of branches,
address changes, etc. It is used by the OBC team.

The RADIUS servers use EST during authentication but the RADIATOR cache will allow previous
authentications in much the same way as a domain login works on a disconnected laptop.

MTAS MID/TID Allocation Service. SQL-Server 2005. Hosted on DCSM at Horizon as this happened to
be in the Network Banking DMZ. Provides input to Authorisation agents, and sends files to Streamline as
MID and TID are allocated.

Does not have a particularly high availability requirement as rate of change is low, but as the banks
simply drop a Request with an invalid MID or TID, faults in MTAS (or at the Streamline end) can be tricky
to track down.

10.1.3.3.2 [BCS] Branch Change Management Server
Ref: /DES/SYM/HLD/0024 (autoconfig)

Ref: /DES/SYM/HLD/0026 (BCMS)

Ref: /DES/SYM/HLD/0031 (BCDB)

Ref: /DES/SYM/IFS/0002 (BCDB to BranchDB)

Branch Configuration Management Service replacing OCMS. This will be based in the corporate estate in
Bracknell with a counterpart at the failover site.

[DN: Currently Lewes, but this may change]
HYDRA Service based on VSH platform.
Counter Package Signing Server
10.1.3.3.4 [DSS] Dimensions Signing Server
HYDRA Service based in BRA01 and LEW02

Packages from Dimensions are signed before delivery to the estate. Tivoli (SYSMAN3) is able to
verify the signature before using a package.

This system is outside the managed data centre service.

10.1.3.4 Branch Access - HNG-x
[DN: Is this list complete?]

10.1.3.4.1 [BAL] Branch Access Layer
Ref: /ARC/APP/ARC/0004
Branch Access Layer based on grizzly from java.net.
There are 10 instances of BAL at the primary site. This number is to cope with a peak day transaction
load, and the service is designed to degrade gracefully, so in practice as few as four or five servers will
cope with normal day-to-day traffic.
Peak time is Monday and Tuesday mornings, with extra load just before major public holidays, although
the change from OBCS vouchers to CAPO card withdrawals has smoothed this somewhat as a "double
payment" is only a single banking transaction whereas two vouchers had to be encashed.
BAL also performs the authentication management for counter connections, and ensures that
reconnecting counters are directed to the BDB node that they previously connected to.
In the event of a BDB node failure BAL redistributes the reconnecting counters amongst the remaining
nodes. Approximately 4000 branches (average 8000 counters) will be reconnecting in this event.
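The reconnection behaviour described above can be sketched as a small directory that remembers each counter's node and only reassigns when that node has failed. This is an illustrative model, not the BAL implementation: the class name, the least-loaded selection rule and the node and counter identifiers are all assumptions.

```python
class NodeDirectory:
    """Remembers which BDB node each counter last used (illustrative only)."""

    def __init__(self, nodes):
        self.live = list(nodes)   # nodes currently accepting connections
        self.assigned = {}        # counter id -> node name

    def _load(self, node):
        # Number of counters currently recorded against this node.
        return sum(1 for n in self.assigned.values() if n == node)

    def connect(self, counter_id):
        """Return the node for a (re)connecting counter."""
        node = self.assigned.get(counter_id)
        if node not in self.live:
            # New counter, or its previous node has failed: pick the
            # least-loaded surviving node (an assumed policy).
            node = min(self.live, key=self._load)
            self.assigned[counter_id] = node
        return node

    def fail(self, node):
        """Mark a node as failed; its counters are reassigned on reconnect."""
        self.live.remove(node)


directory = NodeDirectory(["bdb1", "bdb2", "bdb3", "bdb4"])
first = directory.connect("counter-001")
again = directory.connect("counter-001")  # same node while it stays live
```

Reassigning lazily on reconnect, rather than eagerly at failure time, spreads the redistribution work across the reconnecting counters instead of concentrating it at the moment the node is lost.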
10.1.3.4.2 [BMX] BAL Management Server
Ref: /DES/SYM/HLD/0021 ???
Server with toolset to allow management of the services on BAL.
This server also collects statistics from the BAL platforms and processes them to provide SLA reporting
information.
[DN: To where does it send them and how resilient is the mechanism? Are these "on-line" stats or
historical?]
[DN: As Interstage is no longer on BAL do we need this platform any longer?]
10.1.3.4.3 [BPL] HNG-x Boot Platform
Strictly this is part of Estate Management.
This loads the HNG-x equivalent of the auto-config file to an HNG-x counter spare.
As for its Horizon equivalent, although this does not sound like it has a very high availability requirement,
because of the sheer number of counters in the estate the swap-out rate is relatively high, and the
engineers have an SLA to replace the counter within 20 minutes of arriving at the branch.
This is also responsible for the RCF for Branch Router Provisioning.
10.1.3.4.4 RADIUS servers
Ref: /DES/PPS/HLD/0044.

[...] LPAN. [...] provide N+1 resilience [...]

The reason for having these deployed active/active is that in the event of loss of a site (especially IRE11)
the [...] will immediately try to [...] and, without a RADIUS [...], will
retry. This puts high load on the C&W network.
[DN: What are they actually connecting to?]

10.1.3.4.4.3 [RAD] RADIUS Accounting Server
10.1.3.4.4.4 [RDD] RADIUS Dialled (ISDN) Server

[...] that can be copied over the ADSL line to check the connection speed before installing a counter.

[...] Tivoli to dial [...] ISDN. [...]

It was thought that all ISDN connections would have been migrated to ADSL by the time of HNG-x, but
this looks unlikely and this service is required [...].

[NFM] Firewall Manager
Ref:
Hosted on BladeFrame
[...] and should certainly not be required during a DR.
[SYS] Syslog Server
Ref: DESNETHLD06+2

Active/active. All systems write to both syslog servers. The syslog servers may forward events to Tivoli
via NetCool probes, but it is likely that any events will have already generated an SNMP trap via
OpenView.

The Branch Routers will also report to the syslog servers, so these are fairly high-performance.

10.1.3.5 [BDB] Branch Database
Ref: /DES/APP/HLD/0020

BranchDB is a 4 node Oracle10gR2 RAC Database. The nodes are hosted in the BladeFrame.

The mechanism for ensuring persistent connections and load balancing and for managing failed nodes is
described in detail in the Branch Database HLD (APP/ARC/HLD/0020), with high level context in the
Online Services Architecture (ARC/APP/ARC/0008) and the Branch Database Architecture
(ARC/APP/ARC/0005).

As well as being highly resilient and providing a very high transactional throughput, the data partitioning
scheme allows for more nodes to be added for scalability. In fact the data is typically partitioned 128 ways,
so a corrupt table will only affect a small percentage of outlets, and there is every chance that Oracle
Recovery Manager can repair the problem.
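The effect of 128-way partitioning on the blast radius of a corrupt table can be sketched as follows. The text does not state the partitioning key or method, so hashing a branch code with CRC32 is purely an assumption for illustration, and this is not how Oracle assigns partitions.

```python
import zlib

PARTITIONS = 128  # partition count quoted in the text


def partition_for(branch_code):
    """Map a branch code to a partition (hypothetical CRC32-based scheme)."""
    return zlib.crc32(branch_code.encode("utf-8")) % PARTITIONS


def affected_fraction(corrupt_partitions=1, partitions=PARTITIONS):
    """Fraction of outlets affected if data is spread evenly across partitions."""
    return corrupt_partitions / partitions


share = affected_fraction()  # one corrupt partition out of 128, under 1% of outlets
```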

10.1.3.6 [BDS] Branch Standby Database
Ref: /DES/APP/HLD/0020

This is a second copy on a separate EMC storage array. The replication is using the Oracle DataGuard
mechanism, which extracts changes from the transaction logs and applies them to the standby. This
means that data corruption is unlikely to be transmitted.

[DN: Current thinking is that this is node 5 of the Branch Cluster (and we may need node 6) which
operates two services (databases), the primary and the DataGuard standby. The structure of this section
changes very slightly if this is the case]

10.1.3.7 [BRS] Branch Support Database
Ref: /DES/APP/HLD/0023


Branch Support provides two primary services. The first is SLA reporting via the TWS batch schedule,
and the other is a historical view of data for SSC to analyse support problems.

Oracle Streams replication keeps BRSS within a few minutes of real time normally, but some catch-up
may be required after local DataGuard failover or site failover.

BRSS is regarded as a lower priority service for recovery than BRDB, BRSTBY, BAL and other
counter-facing services such as NBS, APOP, etc.

10.1.3.8 Network Banking

10.1.3.8.1 [NPS] Network Persistent Store
Ref: /DES/APP/HLD/0017

Network Persistent Store. This provides a stateful store for the Banking Authorisation Agents, Debit Card
Authorisation Agents and e-TopUp Authorisation Agents, and also a mechanism for the agents to
heartbeat.

NPS also supports the Track & Trace agent.

The NPS database is a highly available store for transaction journals created by the authorisation agents.
The agents themselves are stateless to simplify the agent resilience model, and the NPS allows such
features as transaction reversal to be managed by any agent.

The basic availability requirement is to recover within the time it takes a card to be re-swiped. Failed
(unacknowledged) transactions time out after 30s, and typical recovery following node failure is <60s.

The database is hosted on RHEL based two-node Oracle RAC in the BladeFrame. The platform also co-
hosts the APOP database.

Each Authorisation agent connects to both RAC instances upon start-up for critical processing threads.
This places a high memory requirement on the server but reduces the failover time for the agent should a
node crash.

The use of Oracle Recovery Manager (RMAN) for backup and recovery allows an even more timely
recovery from corruption than at Horizon, but in practice this has not been an issue in service.

10.1.3.8.2 [NAC] CAPO Authorisation Agent
Ref: /DES/APP/HLD/0009

Authorisation agent for Card Account.

Two agents run, an A and a B instance, to provide performance. There is one instance of each on hot
standby, heartbeating via NPS, thus a total of four NAC platforms in the Data Centre.
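The hot-standby arrangement relies on heartbeats recorded in a shared store. The sketch below models that idea only: the class stands in for whatever NPS provides, and the instance name and the 10 second timeout are hypothetical values, not figures from the design.

```python
class HeartbeatStore:
    """In-memory stand-in for a shared heartbeat table (illustrative only)."""

    def __init__(self):
        self.last_seen = {}  # instance name -> time of last heartbeat

    def beat(self, instance, now):
        """Record a heartbeat from an instance at time `now` (seconds)."""
        self.last_seen[instance] = now

    def is_alive(self, instance, now, timeout):
        seen = self.last_seen.get(instance)
        return seen is not None and now - seen <= timeout


def should_take_over(store, active_instance, now, timeout=10.0):
    """A standby becomes active once the active instance's heartbeat is stale."""
    return not store.is_alive(active_instance, now, timeout)


store = HeartbeatStore()
store.beat("NAC-A-active", now=100.0)
standby_needed = should_take_over(store, "NAC-A-active", now=105.0)
```

Because the agents themselves are stateless, the standby needs nothing beyond the stale heartbeat to decide to start processing.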

CAPO accounts for 85% of transaction volume, and outages are highly news-worthy.

Stateless. No complex failover just restart.

After DR a successful key exchange event indicates that the service is communicating with the FI.


10.1.3.8.3 [NAA] A&L Authorisation Agent
Ref: /DES/APP/HLD/0009

Authorisation agent for Alliance & Leicester, the only bank to sign up for an individual service with POL.

Same resilience model as CAPO, but a single platform is able to host both A and B instances. Normally
one platform will host the active A instance and the other will host the active B instance.

A&L accounts for <5% of transaction volume.
Stateless. No complex failover just restart.

After DR a successful key exchange event indicates that the service is communicating with the FI.

10.1.3.8.4 [NAL] LINK Authorisation Agent
Ref: /DES/APP/HLD/0009

Authorisation agent for Link, the network covering most other major banks.

Same resilience model as CAPO, but a single platform is able to host both A and B instances. Normally
one platform will host the active A instance and the other will host the active B instance.

Link accounts for around 10% of transaction volume.

Stateless. No complex failover just restart.

After DR a successful key exchange event indicates that the service is communicating with the FI.

Note that whilst NAC and NAA initiate connections to the FI, NAL accepts them. If there is a long outage, such
as may occur during a DR, Link will need to be contacted to re-establish connection.

10.1.3.8.5 [TWS] TES & APOP Query
Ref: /DES/APP/SPE/0001 (can't see an HLD)
Ref: ??? (APOP)

Oracle forms server to allow POL (Huthwaite & Chesterfield) users to query APOP and TES databases in
a controlled manner.


There is also a statement about availability:

The Online Query Tool will be available under normal operation on a 24 x 7 basis, bar Daily Maintenance.
[...] for the Online Query Tool may be taken overnight for up to 30 minutes during a possible 4 hour Daily
Maintenance. Fujitsu Services will advise on the start and finish times of the Daily Maintenance window so
this can be communicated to the users of the TES (see NBXO1T6e).

The Online Query Tool will have an availability of 9.75% during 0700 to 2200 measured annually and
reported monthly on an exception basis.

Oracle forms server to allow POL (Huthwaite & Chesterfield) users to query TES in a controlled manner.
Stateless. No complex failover just restart.

The APOP service is an Oracle forms server co-hosted with TESQA. The interface allows query and
update of APOP for releasing and managing vouchers (e.g. Postal Orders). Each stream of vouchers has
a different set of users.

Stateless. No complex failover just restart.

10.1.3.9 Other online services

10.1.3.9.1 APOP Automated Payments and Out-Payments
Ref: /DES/APP/HLD/0011

Co-hosted on [NPS] Platform

Data repository for Automated Payments Out Payments service, which tracks vouchers such as Postal
Orders to check for lost, stolen and forged vouchers, and to provide a position on the outstanding cash
that POL has issued as vouchers but not redeemed.

Also provides stock tracking of vouchers, and a report on unredeemed (out of date) vouchers.

Primary interfaces are a bulk upload of voucher details as they are issued and an update by Post Masters
as vouchers are transacted through [AWS].

Headquarters can also query and update records via a service on [TWS].

10.1.3.9.2 [KMN] Key Management Service
Ref: /DES/SEC/HLD/0003

[DN: Details to be confirmed]

[Figure: Key Management Service components. Recoverable labels: Branch Access Layer (BAL), Banking Agents, Key Server Operator, RA Operator, CA Operator.]

It is understood that the Key Server will operate as a load balanced pair with the NPS as a repository,
which is analogous to the model used for other web services.

It is not clear whether the KMN platform is required during the DR process and therefore needs to be in
the ACTIVE LPAN, but it is critical in allowing other services to start, and must appear early in the service
start-up ordering.


10.1.3.9.3 [DCA] Horizon Debit Card Agent

Prior to the completion of PCI Compliance Horizon counters will continue to use the Horizon DCS Agent,
but re-hosted in Belfast.

[...]

This is the only system in Belfast that will require physical key disks [...].

Post PCI Compliance and for HNG-x counters the HNG-x DCS Agent will be used.

This platform also hosts the Hydra ETS agent which continues for the duration of the Hydra phase.

Resilience will be the same model as now, with two agents per cluster, one at each site, in
Active/Standby mode using Riposte heartbeats to decide which is Active (a total of eight platform
instances).

Stateless. No complex failover just restart.

10.1.3.9.4 [DEA] Debit Card & eTop-Up Agent
Ref: /DES/APP/HLD/0007 - DCS
Ref: /DES/APP/HLD/0008 - ETS

Provides an authorisation service for debit card payments via an X25 link to the Merchant Acquirer
NatWest Streamline, and a similar authorisation for eTop-Ups.

N+1 local resilience is achieved through a pair of agents running active/standby, heartbeating via NPS.

10.1.3.9.5 [DCM] Debit Card Management Server

Ref: /DES/APP/HLD/0055 DCS Bulk File Agents ???

Ref: /DES/APP/HLD/0078 (Streamline)
Ref: /DES/APP/HLD/0077 (eTop Up)

Send and receive settlement files. Input from a DRS share on DAT is converted into a payment file. The
response is an EMIS file which is converted into a C4D file for DRS.

With PCI compliance this server de-obfuscates the PANs in the Payment file before transmission,
and temporarily stores the result on encrypted file store for sending via ftp to Streamline. The DCM also
encrypts the PANs in the EMIS file before passing the C4D feed to DRS.


In Horizon the bulk of the day's transactions are sent at 1500 with a tail at 2000. The previous day's
EMIS file is retrieved at 1500 as the payment file is sent.

For HNG-x this is reduced to just the 21:00 payment file, and the EMIS file is received when
Streamline send it.

The availability requirement is rather low.


10.1.3.9.6 Web Services
Ref: /DES/APP/HLD/0010 (Generic)

All web services operate as a load-balanced pair. In the event of one failing the other is used
automatically.

All of these systems are stateless and may be rebuilt if they fail. The existence of the partner system
provides a continued service while the rebuild occurs.
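The load-balanced pair behaviour can be illustrated with a minimal round-robin selector that skips an unhealthy member. The class and server names are hypothetical, and the real load balancer and its health checks are not described in this section.

```python
import itertools


class LoadBalancedPair:
    """Round-robin over two stateless servers, skipping any marked unhealthy."""

    def __init__(self, first, second):
        self.members = [first, second]
        self.healthy = {first: True, second: True}
        self._cycle = itertools.cycle(self.members)

    def mark(self, member, healthy):
        """Record the outcome of a health check (or a completed rebuild)."""
        self.healthy[member] = healthy

    def pick(self):
        """Return the next healthy member, or raise if both are down."""
        for _ in range(len(self.members)):
            member = next(self._cycle)
            if self.healthy[member]:
                return member
        raise RuntimeError("no healthy member available")


pair = LoadBalancedPair("ws1", "ws2")
pair.mark("ws1", False)   # ws1 fails and is being rebuilt
target = pair.pick()      # requests continue to be served by ws2
```

Because the servers hold no state, the failed member can be rebuilt and marked healthy again without any data resynchronisation.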

10.1.3.9.6.1 [MWS] MoneyGram Web Server
Ref: /DES/APP/HLD/0014

Requires KMN to start up.
Authorisation of “international postal orders".

10.1.3.9.6.2 [PWS] PAF Web Server
Ref: /DES/APP/HLD/0015

Post Code look-up application. This has an internal "database" updated by controlled software release.

10.1.3.9.6.3  [DWS] DVLA Web Server
Ref: /DES/APP/HLD/0012

Interstage with link to external service for looking up vehicle license details.

Stateless. No complex failover just restart.

10.1.3.9.6.4 [OWS] Online Training Web Server
Ref: /DES/APP/HLD/0031


Provides a dummy service for training counters

10.1.3.9.6.5 [HWS] Help Desk Web Server
Ref: /DES/APP/HLD/0013

This service allows logging of problems with the help desk via a web service rather than via a phone call.

10.1.3.9.6.6 [AWS] APOP Web Server
Ref: /DES/APP/HLD/0011

This service provides authorisation for redeeming vouchers.

It also allows counters to update APOP voucher records as vouchers are sold and redeemed.

See also [TWS], which allows Headquarters to query and update APOP. There are two different "APOP
Web Services", which is confusing.

10.1.3.9.6.7 [BWS] Telecoms Web Server
Ref: /DES/APP/LLD/0158
Ref: SVM/SDM/OLA/0002
Ref: SVM/SDM/OLA/0003
Ref: SVM/SDM/OLA/0004
Ref: SVM/SDM/OLA/0005

This service provides authorisation for selling broadband services at the counter. A number of
Operational Level Agreements exist covering the various stages required to check the customer's
creditworthiness and the availability of broadband for their address.

10.1.3.10 File Transfer

10.1.3.10.1 [CDG] C:D Connect-Direct server

Ref: /DES/APP/HLD/0095

Ref: /DES/APP/HLD/0052 (PCI Agents)

The CDG decrypts PANs in the outgoing REC files and obfuscates PANs in the incoming LREC files.

Used to transfer files to and from financial institutions, as C:D is an industry standard.


It is similar to FTMS except that it is stateless and does not return receipts.

As with DCM the availability requirement of this platform is relatively low outside the window when
the REC and LREC files are being transmitted. Even within the window there is considerable margin.

[DN: Need to confirm SLAs on REC file delivery and normal delivery time]

10.1.3.10.2 [PLG] FTMS TIP Local
Ref: /DES/APP/HLD/0051 (FTMS)

FTMS was developed for Horizon to provide a system that could deliver files to a service boundary with
proof for SLA reconciliation.

Although called TIP, TIP has long since been replaced by POLFS, and the old TIP feed (slightly modified)
now goes to POL-MIS.

The TIP gateway is also used by RDS to send files to RDMC and ADS for two-way traffic to LFS and
POLFS.

The critical SLA was delivery of the transaction files to TIP, but the equivalent (BLE) files go direct from
TPS to POLFS now. There is an SLA on the PLO Early file from ADS being available to the counters by
0700, and failure of the system during the core day interrupts flows of data such as ADS pouch delivery
and collection details, Bureau de Change updates, and memos to Post Masters fed via the Reference
Data system.

10.1.3.10.3 [FLG] FTMS EDG Local
Ref: /DES/APP/HLD/0051 (FTMS)

Ref: /DES/APP/HLD/0079 (EDG FTP Pull)

Ref: /DES/APP/HLD/0080 (EDG FTP Push)

Ref: /DES/APP/HLD/0081 (GIRO FTP Push)

EDG (Electronic Data Gateway) will have replaced individual remote AP clients before migration starts.
Principally used for sending files for AP clients to POL for collection by the client from a POL external
facing system (similar to Equifax), it is also used for transmitting other files between the Northern Data
Centre and Horizon because the TIP gateway hit a Windows limit on the number of services
that could be run. POLFS has a number of feeds on this interface.

FTMS (GP) is the remote in Stevenage. One of the main transfers is reference data (outlet address
changes) to the D1 Engineering Support System which sends out spare parts. GP may not be needed at
HNG-x.

[DN: D1 is being shut down. Not sure what is replacing it. Graham Welsh to advise of any changes
needed.]


The T&T EDG Agents also run on the FLG. While there are Horizon Counters working there will be five
such agents: one per Riposte cluster (represented in the NPS) and one for HNG-X.

Ref: CRICDE/018
Ref: CS/OLA/O57
Ref: ASIDPR/013

Ref: DE/LLD/019 - EDG Interface Agent
Ref: DE/LLD/019 - Harvester Agent

When a T&T barcode is applied to a package a record is forwarded to a 3rd party (which may be
ParcelForce) via the EDG gateway.

The resilience model of the EDG gateway [FLG] assumes that fairly low availability is OK as it is just
delivering files to APS customers.

The T&T agent may have higher availability requirements.

Stateless. No complex failover just restart.

10.1.4 Reference Data Test (RDT)

RDT operates as an isolated environment in Horizon. This could either be managed as a DMZ in
Production or as an independent LPAN. The DR requirements are fairly relaxed and merely require
working systems at a reasonably coherent point in time.

10.1.4.1 [RSH] RDT Solaris Host

Four per site, one for each service PL IV OV T. There is no planned data transfer between sites,
as any of these systems should be able to reload from exports on the resilient RDT file share.

10.1.4.2 [RLS] RDT Linux Server

BRDB, NPS and APOP may share a platform. In order to simplify DR these systems have been
implemented in a way that allows EMC Clariion MirrorView or SAN Copy to be used. Any such
requirement is likely to be once per week as in Horizon (tape transfer to LEW02 from BRA01).


10.1.4.3 RDT Web Services

These systems are stateless, but must be kept up to date with any fixes. Replication of the
Clariion disks through SAN Copy is the simplest mechanism.

10.1.5 Services in Test LPANs

There will be one LPAN per test system, each named after the test system as per the naming convention
in DES/PPS/HLD/0006 Naming Standard, e.g. ST for System Test. If a Test service is spread across
several frames there will be one LPAN on each frame, each containing the appropriate set of resources
for that frame.

A Test LPAN only has access to the resources which the global PAN Administrator has granted, so even
if Testers are given LPAN Operator or LPAN Administrator roles they do not have access to other
resources.

It is thus possible to safely trunk a frame to both core and access switch layers, and to present all VLANs
to all trunked ports, as the creation of a vSwitch which maps to each VLAN is only permitted for the global
PAN Administrator, who must explicitly assign the vSwitch to the LPAN. If a vSwitch has been assigned
to several LPANs the LPAN administrator cannot tell; he can only tell that the vSwitch is assigned to
the LPAN being administered.

A user could be given the roles of ST-LPAN-Administrator and RV-LPAN-Operator, and would then be able
to see a wider range of resources, but still only assign ST resources in the ST LPAN. An Operator is
allowed to start and stop pServers and see events, which is appropriate for SMC (2nd line) type users.
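
The per-LPAN role model can be sketched as a simple lookup. This is an illustrative model only, not the PAN management software's interface: the action names (`view`, `start_stop`, `assign`) and the `allowed` function are hypothetical labels for the permissions described above.

```python
from typing import Mapping

# Hypothetical action names: an Operator may see events and start/stop
# pServers; an Administrator may additionally assign resources.
ROLE_ACTIONS = {
    "Administrator": {"view", "start_stop", "assign"},
    "Operator": {"view", "start_stop"},
}


def allowed(user_roles: Mapping[str, str], lpan: str, action: str) -> bool:
    """Return True if the user's role in `lpan` permits `action`."""
    role = user_roles.get(lpan)
    # No role in an LPAN means no access to it at all.
    return role is not None and action in ROLE_ACTIONS[role]


# A user who is ST-LPAN-Administrator and RV-LPAN-Operator.
user = {"ST": "Administrator", "RV": "Operator"}
```

With this model `allowed(user, "ST", "assign")` holds while `allowed(user, "RV", "assign")` does not, matching the example in the text.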

A vSwitch may be presented to several LPANs. If there is some infrastructure which needs to be shared
between Production and Test systems (a NAS based software share, for instance) this can be managed
in a controlled manner.

Each test stream runs in its own LPAN to prevent resource contention. These are generally scaled down
copies of the services described above, but for many systems scaling down does not make sense.

There may be a need for the long term Test service to run full scale transactional volumes, at least for a
single peak day. TBD.

In the event of DR these services will first be shut down.

There is a requirement to operate a minimal test service to enable release of important fixes during DR. It
is not yet certain how this will operate, but in principle there is not a problem operating this LPAN
concurrently with any Production LPAN(s).

10.1.6 Test only services

Some components will exist in the Data Centre to support testing, for example there is a
complete set of backup servers at both sites for Test. These are part of the managed service.
Some components of the test rigs will not be hosted in the Data Centre. This includes all the
test counters, workstations, emulators, crypto devices, etc. which testers need to interact with directly.
The testers normally operate from the BRA01 site, but in the event of this being unavailable operations
will continue from a secondary site (currently LEW02). All the required servers and workstations will
already be provisioned at this site, and the site will be tested for operational effectiveness as part of
Business Continuity Plans.


10.1.7 Remote Services

10.1.7.1 Workstations

All remote sites have a DR equivalent. All workstations and network connections will be pre-built and
available in the event of a DR. Periodic business continuity testing will exercise this equipment and the
processes for accessing the secondary site according to the published yearly schedule of tests.

10.1.7.2 FTMS Remote

FTMS Remote platforms [FRG] and [PRG] operate local N+1 resilience at the primary site, and also have
DR equipment located at the customer secondary site.

The DR equipment is exercised in concert with the customer DR testing, for example the RMG NDC DR
Rehearsal Plan.

The local resilience is tested according to the published yearly schedule of tests. This may include the
customer testing their own procedures for retrieving files from the secondary platform.

(The Northern Data Centre (Huthwaite) Disaster Recovery Test is currently taking
place; OCP17279 refers.)

10.1.8 POL-FS

POL-FS will continue as under Horizon. This is being proved by the "Pathfinder" project.
Ref: [Need a PID ref from Chris Credland]

SVM/SDM/SD/0003 Annex B describes the service.

EA/DPR/005 describes the physical estate and the SAP landscape.

EA/DES/001 describes the resilience and failover.
EA/DES/002 describes the SAN, but this is superseded by DES/NET/HLD/0007.

The IRE19 HNG-x systems are used by Fujitsu Services testers. The POLFS systems are used by
PRISM testers (based in Huthwaite) so form part of a managed service offering to an external customer.
Delaying an urgent fix to the Production system is the POLFS equivalent of not having LST operational.
Two services (HNG-x and POLFS) are to coexist in the same infrastructure. This requirement comes from
SVM/SDM/SD/0003 Annex B, and is due to the much longer DR time for POLFS. Scheduling the POLFS
DR test to coincide with the HNG-x DC test is beyond the scope of this design.

We do not wish to invoke HNG-x DR simply because the main SAP database server has failed.

10.1.8.1 PLP
Production R3

10.1.8.2 PXI

Production XI


10.1.8.3 PLQ
QATest R3

10.1.8.4 QXI
QATest XI

10.1.8.5 PLE
Volume Test (talks to BTC7 in Horizon)

10.1.8.6 PLD
Development R3 (Prism)

10.1.8.7 DXI

Development XI (Fujitsu Services)

10.1.8.8 PLS
Support R3
Currently suspended as disks have been assigned to PLP. Will be resurrected as part of Pathfinder.

10.1.8.9 PLM
Monitor R3
Currently suspended as disks have been assigned to PLP. Will be resurrected as part of Pathfinder.

10.1.8.10 DSP

Production IXOS (OpenText) Archive. This is a simple Oracle database that is used to index the images
stored on Centera.

10.1.8.11 DS

QATest IXOS (OpenText) Archive. Shares storage with Production but using a separate disk area.


11 Security

System fails because of network intrusion or malicious activity:

This is an indeterminate failure. A damage assessment would need to be performed before
permitting failover. This is mitigated by the security design, and the implementation of intrusion
detection and response. Failover will only be implemented at the direction of an RMGA Service
Manager who will have first sought Customer approval.

Ensure only users with the correct roles can initiate any of the failover steps:

There is a need to avoid "back doors” to enable support or invocation of DR, and low level
designs will be reviewed for this. This risk is then mitigated by normal security and access
policies under ID Management.

Users need to be trained:

A programme of introducing support staff to the processes is outlined in Section 2.

Running incorrect scripts could damage systems:

Any scripts must be fail-safe, and must make comprehensive tests that the system is in the
correct state to initiate the action performed by the script. It will be possible to force invocation, but
this should not be the norm, and any warning issued should make this clear. This will be stated in
the "design principles" section of any low level design document dealing with Disaster Recovery.
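
A minimal sketch of the fail-safe principle, assuming each precondition can be expressed as a function returning True or False. The name `run_failover_step`, the `force` parameter and the example check names are hypothetical and are not taken from any low level design.

```python
from typing import Callable, Dict, List


def run_failover_step(
    checks: Dict[str, Callable[[], bool]],
    action: Callable[[], None],
    force: bool = False,
) -> List[str]:
    """Run `action` only if every precondition passes, unless forced.

    Returns the names of the checks that failed (empty when all passed).
    """
    failed = [name for name, check in checks.items() if not check()]
    if failed and not force:
        # Fail safe: do nothing at all if the system is not in the expected state.
        raise RuntimeError("refusing to run; failed checks: " + ", ".join(failed))
    if failed:
        # Forced invocation is possible, but the warning makes clear it is abnormal.
        print("WARNING: forced invocation despite failed checks: " + ", ".join(failed))
    action()
    return failed
```

All checks are evaluated before anything is changed, so a refused run leaves the system untouched.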

Separation of Production and Test domains:

The external, shared infrastructure is managed by teams who have access and clearance for the
Production system, and those management interfaces are essential to timely DR, and will be in
the Production domain. Strict working practices and naming conventions will ensure that there is
a very low risk of presenting Test systems to the Production estate.


12 Scenarios

The resilience scenarios should all be covered by the SRRC published for each service.

If this level of detail for the overall service is needed it should be in an LD (equivalent to the aircraft
industry Failure Modes, Effects and Criticality Analysis (FMECA), which is used to satisfy the certification
authorities as to fitness for purpose; there are useful analogies here).

Disaster Recovery is an indeterminate problem, and a decision making process is more appropriate than
a list of causes. The Business Continuity Framework (FS/BUA/SPP/001) is intended to guide the writing
of such process documents.

In general one must review the degree to which service has been lost or is about to be lost and balance
that against the definite outage that will be caused by a site failover.
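
The balance described above can be illustrated with a deliberately crude calculation. This is a sketch under stated assumptions, not a substitute for the decision making process: the function `should_fail_over`, its parameters, and the idea of measuring loss as "fraction of service lost multiplied by minutes" are all hypothetical.

```python
def should_fail_over(
    lost_fraction: float,
    expected_repair_minutes: float,
    failover_outage_minutes: float,
) -> bool:
    """Compare the expected loss from staying put with the outage a failover causes.

    lost_fraction is the share of service lost (0.0 to 1.0) at the current site.
    A site failover is assumed to cause a total outage for its whole duration.
    """
    expected_loss_if_staying = lost_fraction * expected_repair_minutes
    definite_loss_if_failing_over = 1.0 * failover_outage_minutes
    return expected_loss_if_staying > definite_loss_if_failing_over
```

For example, a total loss expected to last 240 minutes outweighs a 60 minute failover outage, whereas losing a fifth of the service for 120 minutes does not.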
13 List of Platform Types as of 15-NOV-2007
This is not part of the design but is attached for reference. The final list of platforms is
subject to a number of CPs.
Windows
Radius —ADSL Radius-Server_for-Authentication-of Access—and 2003
si Sener ‘ADSL connected branches ‘Authentication Radius Caete! I Se
Sener
‘ADSL_——Test Access—and Windows
Aor [sone Used totestnew ADSL-connections, Access and Windows I Migrated Yes
Windows
arc — AudiSene = Application —-2003_SQL I Migialed Yes
wi syst So
RHEL
aws ¥. Application Migrated Yes
(ita
FU—Teet—Buk
FEU Tost BulkLoader 4
ok. Testing Migrated No
RHEL
Tolecome—Web
Telecoms Web Server =
ews Applicaton Migealed Yas
Wirwad)
Runs—the—ConnectdDirect gateway,
GonnectDirect Windows
Used to~transfer-datato-and-from tion at fos
cos ¥. : Applicat 2008 Server I Migrated =
Connect —Direct Windows
s ‘Simulates-the ConnectDirect eonvce,
cos ¥ Testing Migeaied No.

con E Solaie I Migrated Yes
; Main-Urie-database-conver-forback-
DAT ¥ Solaris Host ‘Application Salas Migrated Yes
Rune-the-bulk-agenis-that-pass data
to-and from Streamline for debl-card
Debit card Windows
DCM ¥ Management processing, andthe —MIGFID  Appication 2003 SOL I Migrated Yoo
Si allocation service (MTAS) database Se
terminal identifiers,
FTU_Tost_DVLA
HTUTost DVLA njoctor .
Cvamr Feating Migrated No
RHEL
a: I ‘Server platform on the EOS physical platform gphontan ‘Server baal ial
(Virtual)
EMC —Controt
end Windows
Centre — Main Controls the EMC storage. ‘Soret:
Ee Gon i: Siagowe I, I Migrated Yeo
AG Y¥ ee Application — Windows I Migrated Yes
FRG y FIMS———EDS-Remole EDG-FTMS instance, Rune at Windows =
Windows
Radlus—GPRS Radlue-Server_for-authenticaton-of Access and 2003
oe ‘Server GPRS connected branches. Authentication Radius Load ee
Sener
INS YU Testnjector FFU Test Injector esting = Migrated No
excel Simulator
ux ¥ esting Wodows I Migrated No
Worksiation used by GS Management
MIS MIS Client Support—Unit-_for access to Data ‘Geerstonal Workstation — Migrated No
Wearehouse, DRS-and TESQ niin
Workstation used by support primarily
hae ms Worketaion — Mugeaiad No
workstation application.
RHEL
wws ¥ Application Migrated Yes
eval
aL
NAR Y Authorisation Application Migrated Yes
access o-Aliance-and Leicester,
braces 2003-Server
ARO ‘Rune—agenisthat_provides—oncine
mac ¥ ‘Oi Wodene Yee
Sener ‘Account eewice (CAPO) caiciaaiiad
tink ‘Rune—agenisthat_provides—ondine
NAL Authotisation ——secone-to-the-Link-natworifor-cash Application WOME grated You
‘Server withdrawals. cal

Database —server—shared across
Network Banking, Electronic Top-up es
Network 2 Gracie
ws yo ‘ Database il
AACR Generally anon se NBS, eres
HU teet Bae
ITU Test PAF Injector zi
PAY esting Migrated No
rhe local-sener_for the TIP-F
PG ¥ — FIMSTIPLocal 7 FIMS  pppication — WIRdOWS I Wigrated Yo
PINPad—Test Used by-T a Pin
per 3 on I Migrated No
sae Windows
E Remote TRETMS instance,
eRe Applicaton Migrated No
Windows
PWS Y¥ PAR Rune the PAF web AReetoge I Migrated Yes
Server
Rad
I em connections from dia
RAD Accounting Renate I soa Migssled Yoo
Sete 2d branches. Server
Windows
Radive Ds! RadiusServertorauthentcation-of Access—and
ROD (Authentication) a = Migraied Yeo
Server pe
Workstation to access the Reference 7
ROMC Operational
ROM workstation — Support Ne
TU Feet Counter ;
ba ITU Tost. Counter Simulator
sm Testing Migrsled No
KG Generation on I Migrated No
Workstation KY ona
S86 ¥ — SSCSemver 9 20s Sever I Mirated = Yes
‘SSC—Suppor ‘Operational
Used by SSC for support tacks.
ssw ¥ 2 Workstation I Migrated No
‘Workstation — where PiNpads—for
“raining PIN Rad ‘
sire fe Covnr—Tlning—Oflees {fear Operstonsl — worsteton I Migrated No
fiom-Live)-ere-loaded-with-securty _ Support
Twol Workstation used to manage the Twol Systems
= Workstation environment. Management Workstation Migrated No
PWS ¥ FES WebSoner se ‘Application 1° I wigioted Yes
parts ofthe APOP application. Sener
Domain
Accoss—and Windows
ACD ¥% — Conivollers_—- Active Diractoryfor HNG-X. New Yos
ec natog, ‘Authentication 2003 Server

aw pus Apptcation Yi New No
‘Securly Windows
as ‘Antivirus Server windows I, I Now Yes
RHEL
BAL ¥ — BALSener ‘The Branch Access Layer servers. Application interstage New Yos
Server
Branch—Change  Sewer—for—the—Branch— Change Windows
acs ¥ Poca! Now Yes
System Server HNG-Xeuccessor to OCMS Management Server
RHEL
Branch Databace . race
‘The main branch database sont,
ee Appcation — § New Yes
Sever
RHEL
eos y Branch Database, Orace hen ee
Server
‘BAL Collis —and— monitors — matics 5
BMX ¥ Management —_inciding SLA Metics-exposed-bythe Operators! aie. New Yes
Colation Server BAL Sapet
Boot — Platform Estate Windows
HING-X Boot platform
BRL New Yos
Historical data copied from the branch ioe
Oracle
Rs fies database server, used-for-euppor. Application Now Yos
Keotesennes: Server
RHEL Backup RHEL_NelBackup media server & Storage —and
ax 3 RHEL Now Yes
es y¥ & ps : Solas I New Yes
Windows Backup Controls. the. backup of Windows Storage and Windows
Bs Sener servers under HNG-X. Backup 2003 Server I New ee
the
CAN OY. Other New No
HNG-X—NT4 The new counter eyelom for HNGX,
ary # = Application —Gounter PG I New No
crs, Cartficate Server New Yos
HNG-X XP The-new counter system for HNG-X,
op yt i Application Gunter PI New No
DEA ¥ Authorisation re = ‘pplication New Yes
3 TU Authorisation Agents 2003 Server
‘DNS——Sener Network
“The primary domain. nameserver
pee RHEL Now Yes
DNs Pecondan) Ne The-secondarydomainname server, eMart RHEL Yes
(secondary) Management
pss Demon ee i al wedows I Now Yos

‘Server—thal_provides —aproxy—for
oxe - RHEL Now Yes
eysioms.
‘Server thal provides a__proxy_for
Internet Data Network
oxi exchange of data—beiweon—Dala RHEL New Yes
Exchange Prony Conte and Exerralinternetsyeteme,  Menagement
‘SYSMAN3-EAS provides the Realtime
‘Sysman alive Dashboard (RAD) layer Server
EAS ‘valabilty aan Sectors RHEL New Yes
ar Omnmtus gateway ——fhneient
setegiaton}
=e Sener Windows-and Solas servers Management Sole I New Ree
EMe Control
Storage—and Windows
Centre —Agents Controle the EMC storage,
ood Backup 2003-Sewver I New os
‘SYSMAN3_—_Database—repostory
service for TPM-and TOM —euppors
Inventory —,—acset—and software
See, dieibution and event-audicemvce Systems Windows
onal Enieprice , Woplaces—theS¥SMAn2—OMOS} Management 2003-Server NOW Ree,
“Thie-ie-the-OWH.2u4,-event-audting
database and TcMepostory.
‘SYSMAN3—EES— provides —the
Golection—Layer—Orenibus Object
Sener functional
Syemman
kes Enlerpiee Event Stowe ag I RSL Now Yes
Sener Gin Dales Setser_ eg ek 95 I Havegervon:
Omnibus gateway for Auciting.
(Equivalent io-clont -eorer TEC)
'SYSMAN3 mult-purpose server which
provides —the—evant—concentation
{unetion for events flowing —from
‘ndivdualplatforme inthe branch
slate “and data conte,
Syeman
: systems
_ ‘Software-depot or Provisioning si _ _ Soe
FanouSever framework Gateway (Qare—bgh
gateways, 2 fr-camopus-corvers-and
the est are branch (chert) gatewaye)
[TCM SAV_ depot, Eventing Proxy
Probe, RE Prox)


EMD_provides-the_user-access-into
the-event-data-held-by-the-eyetom
smanagemment-solution The te inthe
— ‘agaregation——_——layor.
ae ed
His-an_Cmnibue-Objoct Senver_and
=MD fmiiie Wenn RHEE Now Yee
Dieplay ae een
Ws inthe aggregation layer.
{Equivalent —to—the —syeman2
‘mnasterTEC)
‘Syeman,
eM RHEL Now Yes
Sener
fone ye
EMS Esagpice fooneh} Nonegement RHEL New Yes
‘SYSMANS- server which provides the
Data Centre provisioning and software
Seman feoiiies seal Oran
prise indows Campus servers. ‘Systems
ala Provisioning Management RHEL tee ie
Sener {TPM_comver,—FEM_(Campue},-PXE
DHCP server, TPM 08 server,
Rembo insiance?)
ERP New Yeo
Loel ‘Server —for—Estate Management Esiaio ae
est Management 2003-—SQL_New Yeo
Management, Databases Management 2008
‘SYSMAN3 Aka —TEPS —Twol
Enterprise Portal server
‘Sysinan NWiskine
eu Enterprise User ™ prep New Yeo
: Management
4062)
“Supports —Key—management— Also
Hardware Security
us known —26—the— Networked Attala thee
¥ Securty Module Rowe Management Bare re
RHEL
Help Oesk Web Interstage
wws x I elect call logging,-as-a-vitual-platorm-on Application iors New Yos
(Virtual)
KIN -% — KMNG-Sener —- HNG-XKeyManagementsoner, — Saculity Miedows I New
KsN yy KMNG HNG-X Key Management Security ina No
Solari —9
Maestro Systems Server
‘ Maestro 6.0-Legacy Server
MSH ¥. Now Yes
ony:
i

g

E

g

RGR

aie
Nas. :
. Now Yee
team.
Network Management “Allen”
Solis I New Yee
team
Fm
Security 9 : New Yeo
team.
Network
Network Management Sener{HNG- NetWork Saisie I aw 7"
(pees EY Management
Network
eee (HING-X) Management XP. Ls ise
NMS—-—Packet Network Management diagnostics Network Windows
Capture (HNG-X) Management 2003-Server I New ae
Runs the Traning-Onine transaction RHEL
Web-Server ‘962 -vrival_platiorm—on—the EOS Sener New Yoo
physical platform (Mirtual)
Process — Area
Network ee es I es init Yes
(. BladeFrame, Management
ach
Securty Windows
Management Now Yee
Management 2003 Server
FIN. Pad-Proving Operational
Workstation frteting PIN pode, Workstation I Now No
k mele Windows
Network 2008
Router users who-—manage-—network Now Yee
Management Radius
Radlue—Branch padiue a ——
Router Now Yee
Mention Aittenticalon Radive
Marre om
Fle Senver-used-by ROT for backup
RDTEle Sever data—and—storagel—tranefer—of Application — WiNdOWS I Noy Yes
‘reference data fies. aiid
EMC— Secure
. Now Yes
Manager
ROTLinw Branch Databose,_BAL—andAROR Greco
‘Server databases, used in reference data Database: bal ad
Server


[o> [ste [mee tet a oe am

Router ‘Supporis-the-branch- router Known
= Access —and Windows — I yay, ee
‘SupporSene: —_lesupport.the-branch-outerollout,
EMC—Secure
RSG Remote Support New Yes
Sie ‘equipment for support purposes Backup £2008 Server
‘ROT Main-Unix-database-sorver-used ‘Solar
RDF Solatie Host
oat inseference data-verfication, Application — Oracle hoe id
Redormance Database —that —holde—shor-term =
Windows
performance-measures-from-Aihene,
‘SPN Management pee andere I New Yee
38u SAS Seever Now Yes
‘Security Testing
sth ‘Server—Red Hat Testing RHEL New Yes
Linux
‘Security Testing
Windows
‘SenverWindows
sw Testing Bs vas New Yes
‘Securily Testing
six Testing Other New No
‘Sophos_—_Web
SWE ee I - Testing other New No
‘Network
SYSLOG Sewer Network event monitoring server
s¥S enernent I RHEL New Yee
‘Securty Test Test_only platormn Foundation only
co Red—Hat—Linux build-for Seculy testing of Red-Hat Testing RHEL New Yee
Foundation tinue
‘Seculy Test Test_only platormnFoundation only
Ts Solarie_—10_build-for-Seculity-Testing-of Solaris Testing Solaie New Yes
Foundation 40,
‘Test only_platform. Foundation_only
= Secutty oot! build for Secuity Testing.of Windows Testing bien Now Yes
‘Fest-only_plaform.—Foundation- only
‘Secusty Tost Windows
wx bud for SecurtyTesing of Windows I Tesing an Now No
mH oy B val I Ses RHEL New Yee
Vulnerability Securty Windows
al Scanning Sever Management  2003-Server I New Rae
, a ibn —wiich th 7 Windows
iia Host seners Un, Management — 2003-Server I NOW Bes
xes ¥ fH , 6 nn Testing New No
‘ACE Worksiation  Worketalion-used to manage Suppo
ACE ‘Securty Workstation I Retiring No
Estate
ace x A a Retiing Yon

Pack
acs ee No
(orzo inchidethe-AutoGonfig-signingsener, PeHbulon — Gayyy
‘Runs-agents that move data between
databases —and correspondence
servers. NA_Sener
AGE ¥% Generic Agents Application Retiring Yes
‘Alco some stand alone online services
that use the-coxtespondence servers.
te Workstation to access—the Aut
AUD -¥% Workstation Application Workstation I Retiring No
(Hotzon} ‘casi
Solas —Backup Bleeatie ait
[: ‘Server to backup Solaris servers,
Bac ¥ 3 Solas I Reting Yo:
Windows
BK Retiring Yoo
Sewer (Horizon) servers.
Backup ee
‘The Boot Loader plaform-for SO2and
‘T60-spate_instalation (pon-Branch Access, “and I NT4” Server
aus Boot Loader Retiing Ys
‘The Boot Server platform for Horizon
B00 Boot Sones ' tee I eee Imaiing, I ies
domain contvoler EEE I SEER
Authority systems, Specialised hardware. Also Seourty
om Workstation Management #F6F ee IS
(Horizon) ‘Authority Workstation {CAW),
‘BveS——and
Sofware NT4_Senvor
ous Dimensions ——The-CMsigning-senver Retiing No
‘Signing Server Distribution Hydracniy)
cNH OY No
Correspondence Messaging server to-pass- messages NT4 Server
COR  Seners to-and fromcounters. ‘Applestion I aiydreonny I Retin I ee
U—Fost
‘SYSMAN2 FTU_Test_SYSMAN2-SMDB-Server
cou ed ro Testing Retiring No
(ena POG)
DCS_and_ETS Supports omine access to the- debit
DcA x ica Yes
(Horizon) for Horizon-courters only. ieee
wU_———Fost
‘SYSMAN2
Det Delivery Server No
(eamed using
code SYSDEL)
Domain
DOM ¥% Controller Yes
(Horizon)

Enterprise Server (ACE) runs the RSA
ext Solaris I Retiring No
(Retiring Horizon platforms only)
FTU Test Ghost
FTU Test Ghost Server

ost 5 Testing Retiring No
—

SYSMAN2

= Inventory Sener HU Tost SYSMANZ Ierton/ S66" reng oe
‘sing ——-cod '

Seshoy
ey — management workstation,

ii (Horizon) security team to manage Horizon Management Workstation I Retiring No

weve:

kus Retiing Yes
KMA Admin. Administrative workstation into KMA

Security
KSA No
(Horizon) security team, Management
SYSMAN2 Tivoli Gateway server for
Lew Solaris I Retiring Yes
campus gateway
FTU Test
SYSMAN2 Tivoli FTU Test SYSMAN2 Tivoli
Man Workstation Workstation (named using code Testing Retiring No
SYSMAN)
Code SYSMAN).

pa ‘ppleation Rotting Yee
pecations

cM Me Me Mg Me (Hs Retiring: Yes
Syelom Server

ow ocus Workstation to 20cees the Operational Estate Wonton I Rating I We
Managomont

re Systems NT4 Server
SYSMAN2 OMDB archive server

oma Database Retiring Yes
(OMDB)
seme Server for the Operation Management NT4 Server
Inventory Server

me = Database (OMDB). Management (Hydra only) I Retiring Yes
—

-" Provides one-time passwords for Operational te
Password engineers' access to Horizon counters Support Workstation  Retiring
rru___Test
SYSMAN2TwollFTU-Test SYSMAN2 Thol-PO Client

pow SEMAN Tht I 8 Fosting Retiing No
Gateway


PIN Pad Proving
Operational
pa Workstation Workstation for testing PIN pads. Workstation I Retiring No
(Horizon) Support
Windows
SAS Server provides controlled access to other I Security
ao) Trane 200 Cae Retiring Yes
Horizon servers.
SYSMAN2 Tivoli Enterprise Console ——-
‘scT ‘Sysman—Glient over to which events from-counters  SYS!ers eee Retiring ‘Yes
tee ot Management (vga
Only)
Sysman Domain Systems NT4 Server
soc SYSMAN2 domain controller Retiring Yes
ans Server SYSMAN2 software distribution Distribution (Hydra only) I Retiring Yes
SYSMAN2 TEC used for processing
= Expedited TEC Systems management actions I Management I Sede I Retiring Yes
monitoring
Syeman
smo Retiing No
(SMB)
Jorkstation 8 roms iri
Mk Workstat KMS Help Desk Workstation Retiring No
Also known as the Swing TMR for the
Nt branch estate.
‘Sysman Master ‘Systems
Me Solos I Rating Yas
a] This is temporary — not part of, ienagerae
‘SYSMANG
‘SVSMANO—MacterTivol-Enterprice
sur ts Solate I Rating Yo
operator consoles .
Solas 9
Server
NT brs a Retiing Yes
Oni)
The Short-Term Performance
seo Performance Rotting No
domain controler in-whch-i-the Suppo ydraans
Dolabane Sener domain cont
Syoman—_Post I SYSMAN2Horiaon Tio B310W0Y — Sgmy
iit Office Gateway ‘Management aoa Lica cs
Soae —0
= SYSMAN2 Two Entrpise Console . _
Tee Morogoro Cinta Sate
sora sears as
Sysman Secure  SYSMAN2 Horizon Tivoli gateway
sso Post Office server for client-facing campus SIH Solaris I Retiring Yes
Gateway endpoints (in DMZ5) bitin

[o> [stm ett ee oe

Sysman Server TEC, also known as I Systems
st Syeman ste Sysman Serv Solaris I Retiring Yes
Software repository on which all
6 Staging Server force by support staff for system Distribution (Hydra only) I Retiring Yes
builds
Solaris 9
MAN2_Hovizon Tivol pseece]
— Gateway server for Campus-endpoints Management (Hydra I Petting Yee
Only
ACSLS Server Storage
mM Robot controller. Solaris I Retiring Yes
i Horizon ___SYSMAN2___Tivol Server
— eregement Management Region-(TMR} server Management (Hydra bi I hata
aiid Only)
Supports virtual private network (VPN)
to Horizon branches. Monitoring
VPN Loopback Access and NT4 Server
vow ¥2 Workstation to check availability of all Retiring Yes
Web Service endpoints,
Supports virtual private network (VPN)
vex ve ¥ Retiring Yes
{ine} VPN key) from KMA to branches
Supports virtual private network (VPN)
VPN Policy Seen ore et nor un Access and NT4 Server
vem v2 Server manages delivery of VPN Retiring Yes
linked to KMA for key revocation.
Supports virtual private network (VPN) Access and I NT4 Server
VON ¥2  VPN Server to Horizon branches. The VPN Server Retiring Yes
Provides the main routing function. Authentication (Hydra only)
‘ABOR Workstation delivered to Post Office to
aPw OY eset »{° Application Workstation Unchanged No
POLES__SAP
KO Archive Server
——
wwe ¥ Yes
Giient
POLES SAP
NWS ¥% Middleware SAP NetWeaver Middleware server Application Solaris SAP Unchanged Yes
Server
par ]y I Gore. rer 3 on Unchanged No
POLES SAP
samt y I ROUES Yos
ROLES —SAP
sara x FOL Yes