FUJ00075614 - Peak Incident Management System - Lost transaction following SCO replacement

Peak Incident Management System

PC0052823

Deleted User -- Deleted Team

Targeted At -- BI_2

PWY_WP_12903

Product Incidents/Defects

B -- Progress stopped

Deleted Contact

Closed -- S/W Fix Available to Call Logger

24/08/2000

Lost transaction following SCO replacement

Work Package : PWY_WP_12000
Fast track fix : FSTK_2_0_WP12000
Work Package : PWY_WP_12145
Fast track fix : FSTK_2_0_WP12145
Work Package : PWY_WP_12903

target
host t
Releas:

Iso tha
quirr

wou

Ipres:
nas noi

jorigin:

Rele

r-Cele:

Following rollout of a celest

med
yup t
syn
0 Ex

he Edrive from

the

el.
mining
joBcs transaction performed before
counte:

r MsSt
1d app:

ge processor came out o

remaining messag
synchronised. The net result appears to be that all original message

ntl

Ww been
Althou:

tached extract is all node id 1

al mes:

lat approx 16:4

CALL PC0052823:Prio:

E messaging sw
Unknown

acement

befor:
© message number 660 from node 1
chron

om node 1 existed only in the C
re! message store at the

he origion
initial MsStore sync!

the before and after balance snapshots, it was clear that an
p-out had now vanished from the

ore.

ear that after the recovery from

generated, the server switched back to recovery mo:
the F: drive mirror message store above

he C: d m prior to the
overwritt
gh unconfirmed ,

laining the
sages
large).
prox

le if required but
sage 661 was written at

Gareth Jenkins viewed this error on rig with
y C:CallType P - Target 29/08/¢

Date:21-Aug-2000 12:51:00 User:Deleted User (Robert Hillyard feb01)
CALL PC0052823 opened
References en!

added

CI3_2R2 to cra

a built SCO several transactions were
was replaced with a triage

whic!

sed to the correspondence server. 5
drive message store and the
ime of triage ewap.

counter was placed into the replac
ronisation would come from

the

very mode after s

ter then wrote a Riposte on-line message as 661
g to syn

ent 5

660 for node id 1, Suspect this was by rep.
correspondence serv ou

for node 1 before 1 second later attempti

lnirror message store, At this point a red e
Inessage' was

iage

n by the ‘on-line’ message immediately following the

Eve:
6:02.

Mike

e of
fh was a priority OBCS stop

wever further messages

nt counter
e Fdrive

squirrel completed ,
ronising up to message

he

lication with the

chronise to the F: drive
egarding ‘self originated
nd the

are
which

p apart from 6

ounter message st
logs also attached.
New message 661 was written

(full

Berrisford

New evidence added - Node id 1 messages from [...]
New evidence added - Event [...] file .evt

Date:21-Aug-2000 12:53:00 User:Deleted User (Robert Hillyard feb01)

ter messagestore

} Res
fac In:
r don'

catego

Defect
Hours

ponse
fra. Dé

ry tc!

spent

Date:23-Aug-2000 12:47:00 User:Dave Royle
[...]ev (0:[...]
Gareth initia[...]eve this to be a only c[...]unter issue, [...]
sco replacement issue in current live, and on that basis suggest a CS [...]
[...] 21305532
Responded to call type P as Category 30 -TL[...]
The response was delivered on the system
The Call record has been transferred to the Team: QFP
Defect cause updated to 99 :General - Unknown
Hours spent since call received: 0.2 hours

Date :23-Aug-2000 15:33:00 Use
The Call record has been transferred to the Team: Infrastruc-Dev
Hours spent since call received: 0 hours


Date:23-Aug-2000 15:34:00 User:Lionel Higman
The Call record has been assigned to the Team
Hours spent since call received: 0 hours

Date:25-Aug-2000 13:56:00 User:Del(01/01 Denise Jackson)
[...] authorised categorisation B
Target Release updated to MRB
The call references have been updated. They are now:-
Other : B

Date:31-Aug-2000 13:59:00 User:Del(04/03 Brian Orzel)
seeking further i[...]

Date:04-Sep-2000 12:37:00 User:Del(04/03 Brian Orzel)
being retested this week

Date:04-Sep-2000 14:14:00 User:Del(04/03 Brian Orzel)
[...] was not aware that a single neighbour was acceptable to c[...]
recovery mode. This will inevitably cause self originated mes[...]
and needs to be fixed.
After a chat with Gareth, I don't think we have a sound impl[...]
this area, but any change will require careful thought and design.
[...] transfer[...]ing it to the TDA stac[...] f.a.o. Gareth, and a[...]
with him and Mark [...]
Fix could either be [...]le or UK code, or b[...]
Brian Orzel.
The Call record has been transferred to the Team: TDA
Hours spent since call [...]

Date:05-Sep-2000 08:00:00 User:Gareth Jenkins
The Call record has been assigned to the Team Member: Gareth Jenkins
Hours spent since call received: 0 hours

Date:19-Oct-2000 11:13:00 User:Gareth Jenkins

A similar instance of this has occurred on Live (PinICL 51255).
[...] that case recovery took place from a counter rather than the CS
[...] failure.) That has been closed and this PinICL is being used to
[...] underlying problem.
Gareth

Date:01-Dec-2000 17:03:00 User:Lionel Higman
Updates agreed at tdaqfp (JD/JMcL/LMH)

Target Release updated to DTL - unknown

The call references have been updated. They are now:-
Other : B
@ other

utu:

Date:11-Dec-2000 17:53:00 User:Gareth Jenkins
[...]re another ex[...] also 58435 [...]
Gareth

Date:19-Dec-2000 15:47:00 Use
Another duplicate is 58686 (wh
Gareth

Date:20-Dec-2000 16:21:00 User:Deleted User
RMP 20/12/00- Fix at CI4R ASAP (January - Week 1). Ple

Date:03-Jan-2001 11:49:00 User:Del(05/01 John McLean)
The Call record has been assigned to the Team
Hours spent since call received: 0 hours

Date:03-Jan-2001 15:06:00 User:Del(06/01 Peter Morgan)
F} Response
Responded to call type P as [...] Incident [...]
The response was delivered on the system


Date:09-Jan-2001 16:43:00 User:Glenn Stephens
transferring to infrastructure - who have the specif
change from the tda

The Call record has been transferred to the Team: Infrastruc-Re
Defect cause updated to 7 :Design - High Level be

Hours spent since call received: 1.0 hours

Date:14-Feb-2001 07:17:00 User:Karen Morley
The Call record has been assigned to the Team Member: Roger Goldring
Hours spent since call received: .1 hours

Date:02-Mar-2001 20:52:00 User:Lionel Higman
Target Release updated to CI4

Date :26-Apr-2001 13:39:00 User:Lionel Higman
Raised priority at request of RMF.
CALL PC0052823:Prio: CallType P - Target 24/08/00 13:

B

Date:04-May-2001 14:01:00 User:Del(06/01 Peter Morgan)
F} Response

The removal of the writing of $RiposteHeartbeats messages at CI4M1 MAY have
resolved this problem, as it appears that the Messages from the MirrorDisk
that were being over-written were of this type.

However, this is not certain,

[END OF REFERENCE 25985328]

Responded to call type P as Category 40 -Incident Under Investigation
The response was not delivered to external mailer as email address is invalid

Date:09-May-2001 15:37:00 User:Roger Goldring
F} Response

[...] had a chat with Gareth Jenkins, and he suggested that the design work
currently in progress would still be needed.

[END OF REFERENCE 26018499]

Responded to call type P as Category 40 -Incident Under Investigation
The response was not delivered to external mailer as email address is invalid

Date:11-May-2001 08:49:00 User:Angela Shaw
This fix needs bringing forward before S10 for this problem. There have been
5 reported occurrences of this problem type recently, which causes
receipts and payments mismatches. Every occurrence gives rise to great
concern with PON in that we cannot fully reconcile at the outlet, & there
may be also knock on eff[...]tions may be lo[...]
Fix needs to be brought forward as a matter of urgency. Thanks

20

ier

verabley.

Date:11-May-2001 09:50:00 User:Lionel Higman
The call references have been updated. They are now:-

Date:21-May-2001 12:04:00 User:Del(06/01 Peter Morgan)

New evidence added - ZIP file containing FAD 84102 M/Store, the overwr[...]
F} Response

At CI4M1, the messages in Collection $RiposteHeartbeat are no longer used,
and so these will not over-w[...] messages p[...] retrieved from a neighbour,
if Riposte becomes available before it SHOULD be available (because
everything has not been passed over from a neighbour, usually the Mirror disc
on an SCO).
However, data in RiposteVersionstring is still overwriting data, and thi[...]
cause a Cash Account misbalance.
[...] see the attached, that happened at FAD 084102 on 18/5/01, [...]is being
an SCO
Full MessageStore
event log
overwritten messa[...] from the Mirror dis[...]
Please forward to QFP for the attention of Gareth Jenkins
[END OF REFERENCE 26145993]

Responded to call type P as Category 40 -Incident Under Investigation
The response was not delivered to external mailer as email address is invalid

Date :04-Jun-2001 08:53:00 User:Angela Shaw
Peter, can you please progress as per Peter M's last update. These types of
scenario are costing Pathway considerable amounts of money when this
happens, as we are liable [...]t £100/txn to PON. Thanks

Date:05-Jun-2001 16:58:00 User:Roger Goldring
F} Response

RolloutSynch exe module has been changed. Please note that on our limited
testing facilities the original problem of lost transactions, as described in
this PinICL, is not reproducible. It is considered essential that the
failing situation be set up and demonstrated (using the existing RolloutSynch
module), and that installing the new RolloutSynch.exe module fixes the
problem. The new module tackles the problem by eliminating the conditions
under which lost transactions occur - through manipulation of the Riposte
Recovery Neighbours parameter.

[END OF REFERENCE 26330690]

Responded to call type P as Category 46 -Product Error Fixed
The response was not delivered to external mailer as email address is invalid
The Call record has been transferred to the Team: Dev-Int-Rel
Defect cause updated to 4[...] :General - in Procedure
Hours spent since call received: 74.0 hours


Date:06-Jun-2001 09:35:00 User:Miho Fujii
The call references have been updated. They are now:-
Work Package : PWY_WP_12000

Date:06-Jun-2001 15:44:00 User:Miho Fujii
The call references have been updated. They are now:-
Work Package : PWY_WP_12000
Fast track fix : FSTK_2_0_WP12000
[...]st.
Responded to call type P as Category 60 -S/W Fix Released to Call Logger
Hours spent since call received: 0 hours
The response was not delivered to external mailer as email address is invalid

Date :06-Jun-2001 15:46:00 User:Miho Fujii
[...] available, please [...]
The Call record has been transferred to the Team: BTC Rel Mig
Hours spent since call received: 0 hours

Date :14-Jun-2001 17:54:00 User:Dave Royle
The Call record has been assigned to the Team Member: Dave Royle
Hours spent since call received: 0.5 hours

Date :13-Jun-2001 14:54:00 User:Dave Royle
F} Response :

Lionel, Can you retarget this at [...]. The S03R [...]
(12000) will need to be withdrawn and an [...]

Responded to call type P as Category 40 -Incident Under Investigation
The response was not delivered to external mailer as email address is invalid

Date:13-Jun-2001 14:57:00 User:Dave Royle
The Call record has been transferred to the Team: QFP
Hours spent since call received: 0.5 hours

Date:13-Jun-2001 15:07:00 User:Lionel Higman
Target Release updated to CI4si0

The Call record has been transferred to the Team: Infrast
Hours spent since call received: 0 hours

Date:14-Jun-2001 19:09:00 User:Karen Morley
The Call record has been assigned to the Team Member:
Hours spent since call received: .1 hours

Date :06-Jul-2001 09:14:00 User:Roger Goldring

The call references have been updated. They are now:-
Work Package : PWY_WP_12000
Fast track fix : FSTK_2_0_WP12000
Work Package : PWY_WP_12145

F} Response

[...] problem has been addressed by adding a capability of setting the number of
Recovery Neighbours to RolloutSynch; the work package is WP12145.

Please note the caveat mentioned above, repeated here for convenience:
RolloutSynch exe module has been changed. Please note that on our limited
testing facilities the original problem of lost transactions, as described in
this PinICL, is not reproducible. It is considered essential that the
failing situation be set up and demonstrated (using the existing RolloutSynch
module), and that installing the new RolloutSynch.exe module fixes the
problem. The new module tackles the problem by eliminating the conditions
under which lost transactions occur - through manipulation of the Riposte
Recovery Neighbours parameter.

[END OF REFERENCE 26871848]

Responded to call type P as Category 48 -Fix Released to
The response was not delivered to external mailer as email address is invalid
The Call record has been transferred to the Team: Dev-Int-Rel
Hours spent since call received: 3.0 hours


Date:06-Jul-2001 09:53:00 User:Miho Fujii
The call references have been updated. They are now:-
Work Package : PWY_WP_12000
Fast track fix : FSTK_2_0_WP12000
Work Package : PWY_WP_12145

wre

Date:31-Jul-2001 11:01:00 User:Miho Fujii
The call references have been updated. They are now:-
Work Package : PWY_WP_12000
Fast track fix : FSTK_2_0_WP12000
Work Package : PWY_WP_12145
Fast track fix : FSTK_2_0_[...]
F} Response

Fast track available, pleas[...]
[END OF REFERENCE 27151818]
Responded to call type P as Category 60 -S/W Fix Released to Call Logger
Hours spent since call received: 0 hours
The response was not delivered to external mailer as email address is invalid

Date :31-Jul-2001 11:02:00 Use
Fast track available, please [...]
The Call record has been transferred to the Team: BTC Rel Mig
Hours spent since call received: 0 hours

Date :10-Aug-2001 10:00:00 User:Dave Royle
The Call record has been assigned to the Team Member: Steve Bansal
Hours spent since call received: 1 hours

Date:07-Sep-2001 12:07:00 User:Mike Berrisford
Fix did not work as required. See attached extract
Jcareth / Tivoli wrappi: am giving explanations
Iss suggested by Gareth in response to the mail fur
required to the des
jon Monday 10/09/01.
Routing back to Inf dev pending disc
New evidence added - Word document ext
The Call record has been transferred to the Team:
Hours spent since call received: 1 hours

of the fix which he is inte:

act

°:
nding to discuss

ssions with Gareth.

il dese:
Infrastruc-Dev

en myself /
ns.
with Karen

ping problem

Date:12-Sep-2001 07:50:00 User:Lionel Higman
[...]sed with Karen Morley, the correct release for an infrastructure fix
[...]etting target release as such.
Target Release updated to BI_2

Date:17-Sep-2001 06:59:00 User:Karen Morley
The Call record has been assigned to the Team Member: Roger Goldring
Hours spent since call received: .1 hours

Date:06-Nov-2001 14:25:00 User:Roger Goldring
The call references have been updated. They are now:-
Work Package : PWY_WP_12000
Fast track fix : FSTK_2_0_WP12000
Work Package : PWY_WP_12145
Fast track fix : FSTK_2_0_WP12145
Work Package : PWY_WP_12903
F} Response
Problem fixed in WP12903
[END OF REFERENCE 28062744]
Responded to call type P as Category 48 -Fix Released to
The response was not delivered to external mailer as email address is invalid

Date :06-Nov-2001 14:26:00 User:Roger Goldring
The Call record has been transferred to the
Hours spent since call received: 1.0 hours


They are now:-

Work Package :
Fast track fix : [...]WP12000
Work Package :

Date:29-Nov-2001 18:25:00 User:Del(01/03 Ajay Nehra)
Fix released in PWY_WP_12903. Please route to call logger when processed by
[...]

Date:30-Nov-2001 13:24:00 User:Miho Fujii
F} Response

Fast track available, please [...]
[END OF REFERENCE 28298997]
Responded to call type P as Category 46 -Product Error Fixed

Date:30-Nov-2001 13:25:00 User:Miho Fujii
The response was not delivered to external mailer as email address is invalid
The Call record has been transferred to the Team: BTC Rel Mig
Hours spent since call received: 0 hours

Date:30-Nov-2001 13:36:00 User:Dave Royle
The Call record has been assigned to the Team Member: Dave Royle
Hours spent since call received: 0.5 hours

Date:11-Dec-2001 12:33:00 User:Angela Shaw
Dave, the customer had been advised that this problem had been fixed, so
would it please be possible to put this through testing & get the new WP sent
down to the counters as soon as possible. Pathway have to pay for these
incidents, plus will have to explain this reoccurrence. Thanks

Date :20-Feb-2002 16:38:00 User:Dave Royle
F} Response
Mike, Based on your extensive experience at S10, can you have a look at this
with the changes introduced at BI2 (sorry!). Ta Dave
[END OF REFERENCE 28941671]

Responded to call type P as Category 40 -Incident Under Investigation
The response was not delivered to external mailer as email address is invalid
The Call record has been assigned to the Team Member: Mike Berrisford
Hours spent since call received: 1 hours

Date:24-Mar-2002 15:51:00 User:Mike Berrisford
Have attempted retest using the new BI2 base units. Unfortunately we are
currently only at the stage where these base units auto regress to BI2a.
The full solution to this problem is two pronged. Rolloutsynch has changed to
execute a gradual reduction of RecoveryNeighbors to a minimum of 1, time
based, to give the counter time to synchronise with as many neighbors as
possible before coming on line whilst at the same time ensuring that if for
any reason 1 or more neighbors is not contactable that the box won't hang
itself.

This works in conjunction with a new hard setting of RecoveryNe[...]s in the
[...]ry to 32 (a speci[...] Riposte reg[...]ue that indic[...] to Rip[...]
that it should attempt synchronising with all available neighbors as
determined from the Neighbors registry parameter).

Unfortunately this new value of 32 is associated with the BI2b counter
baseline. As our co[...]ers auto regress to BI2a (where the parameter is
no[...] present le[...]ving [...]ste to default to a value of 1) we are unable
[...] test this at this time, [...]l hold on stack until we can attempt a BI2b
box swap.

Date:09-May-2002 09:51:00 User:Del(04/03 Ray Fenwick)
The Call record has been assigned to the Team Member: Steve Bansal
Hours spent since call received: [...] hours

Date:09-Sep-2002 15:41:00 User:Del(04/03 Steve Bansal)
of BI3 cyclel g wap on 120817.
dito 2

nario from abo
n 1 and finaly


Development - Reference Data

Deleted User -- Deleted Team

Infrastructure -- RIPOSTE messaging sw (version unspecified)

Deleted User -- Deleted Team

09-Sep-2002 16:18 -- Del(04/03 Ray Fenwick)