Continuent Developer Community
TOPIC: Re:Replication Breaking Under Load?
#740
Cody Payne (User)
Fresh Boarder
Posts: 18
Replication Breaking Under Load? 2 Months ago Karma: 0  
We are running tungsten-community-1.3-rc-2 with 2 nodes (one master and one slave, with RW splitting enabled). Under our stress tests with 800 concurrent users we have seen a couple of cases where replication breaks. Two examples are below.


Example 1:
pendingError: Event application failed: |
| com.mysql.jdbc.exceptions.MySQLN |
| Connection.close() has already |
| been called. Invalid operation in |
| this state. |
| pendingErrorCode: NONE |
| pendingErrorEventId: 000003:0000000047304340;0 |
| pendingErrorSeqno: 72892 |
| pendingExceptionMessage: com.mysql.jdbc.exceptions.MySQLN |
| Connection.close() has already |
| been called. Invalid operation in |
| this state. |
| resourceJdbcDriver: com.mysql.jdbc.Driver |
| resourceJdbcUrl: jdbc:mysql://decxpdb007:3306/${D |
| BNAME}?jdbcCompliantTruncation=f |
| alse&zeroDateTimeBehavior=conver |
| tToNull&tinyInt1isBit=false&allo |
| wMultiQueries=true&yearIsDateTyp |

Example 2:
[LOGICAL] /cluster/cluster2/> ls -l decxpdb007

COORDINATOR[decxpdb007:MANUAL:ONLINE]

ROUTERS:
+-----------------------------------------------------------------------+
|decxpdb006:10999:ONLINE |
|decxpdb007:10999:ONLINE |
+-----------------------------------------------------------------------+

DATASOURCES:
+-----------------------------------------------------------------------+
|decxpdb007(slave:ONLINE, progress=-1, latency=0.0) |
+-----------------------------------------------------------------------+
| activeConnectionCount: 28370 |
| appliedLatency: 0.0 |
|callableStatementsCreatedCount: 0 |
| connectionsCreatedCount: 14185 |
| dataServiceName: cluster2 |
| driver: com.mysql.jdbc.Driver |
| highWater: 2(000003:0000000047300711;0) |
| host: decxpdb007 |
| isAvailable: true |
| lastError: |
| lastShunReason: |
| name: decxpdb007 |
| precedence: 99 |
|preparedStatementsCreatedCount: 0 |
| role: slave |
| sequence: Sequence(0:0) |
| state: ONLINE |
| statementsCreatedCount: 0 |
| url: jdbc:mysql://decxpdb007:3306/${D |
| BNAME}?jdbcCompliantTruncation=f |
| alse&zeroDateTimeBehavior=conver |
| tToNull&tinyInt1isBit=false&allo |
| wMultiQueries=true&yearIsDateTyp |
| e=false |
| vendor: mysql |
+-----------------------------------------------------------------------+
+-----------------------------------------------------------------------+
|decxpdb007.corporate.connextions.net: REPLICATOR(role=slave, |
|state=OFFLINE:ERROR) |
+-----------------------------------------------------------------------+
| appliedLastEventId: NONE |
| appliedLastSeqno: -1 |
| appliedLatency: -1.0 |
| clusterName: cluster2 |
| currentEventId: NONE |
| host: decxpdb007.corporate.connextions |
|.net |
| latestEpochNumber: -1 |
| masterConnectUri: thl://decxpdb006/ |
| masterListenUri: thl://decxpdb007/ |
| maximumStoredSeqNo: -1 |
| minimumStoredSeqNo: -1 |
| pendingError: Event application failed: |
| com.mysql.jdbc.exceptions.MySQLN |
| Connection.close() has already |
| been called. Invalid operation in |
| this state. |
| pendingErrorCode: NONE |
| pendingErrorEventId: 000003:0000000047304340;0 |
| pendingErrorSeqno: 72892 |
| pendingExceptionMessage: com.mysql.jdbc.exceptions.MySQLN |
| Connection.close() has already |
| been called. Invalid operation in |
| this state. |
| resourceJdbcDriver: com.mysql.jdbc.Driver |
| resourceJdbcUrl: jdbc:mysql://decxpdb007:3306/${D |
| BNAME}?jdbcCompliantTruncation=f |
| alse&zeroDateTimeBehavior=conver |
| tToNull&tinyInt1isBit=false&allo |
| wMultiQueries=true&yearIsDateTyp |
| e=false |
| resourceVendor: mysql |
| role: slave |
| seqnoType: java.lang.Long |
| sourceId: decxpdb007 |
| state: OFFLINE:ERROR |
| timeInStateSeconds: 299.571 |
| uptimeSeconds: 82980.335 |
+-----------------------------------------------------------------------+
+-----------------------------------------------------------------------+
|decxpdb007:DATASERVER(state=ONLINE) |
+-----------------------------------------------------------------------+
| state: ONLINE |
+-----------------------------------------------------------------------+

We can bring them back online without trouble using the cctrl command "replicator decxpdb007 online", but why is it breaking in the first place? These servers have 16 cores and 128 GB of memory, and we chose "large" during the Tungsten install.

Thank You!


Cody
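
The status dumps above wrap long values (the JDBC URL, the exception text) across several table rows, which makes them awkward to compare between failures. Below is a minimal Python sketch, not part of Tungsten, that rejoins the wrapped rows. It assumes every property row starts with a single word followed by a colon and that any other row continues the previous value; the name parse_properties is made up for illustration. Fragments are kept as a list because the table does not record whether a wrap fell inside a word or between words.

```python
import re

# A property row looks like "key: value"; the key is one word with no dots or spaces.
KEY_ROW = re.compile(r"^(\w+):\s*(.*)$")


def parse_properties(table_text):
    """Map each property name to the list of value fragments found for it."""
    props = {}
    current = None
    for raw in table_text.splitlines():
        cell = raw.strip().strip("|").strip()
        if not cell or set(cell) <= set("+-"):
            continue  # blank row or +-----+ border
        match = KEY_ROW.match(cell)
        if match:
            current = match.group(1)
            props[current] = [match.group(2)] if match.group(2) else []
        elif current is not None:
            props[current].append(cell)  # wrapped continuation of the previous value
    return props


# Rows copied from Example 2 in the post above.
SAMPLE = r"""
| pendingError: Event application failed: |
| com.mysql.jdbc.exceptions.MySQLN |
| Connection.close() has already |
| been called. Invalid operation in |
| this state. |
| pendingErrorCode: NONE |
| pendingErrorSeqno: 72892 |
| resourceJdbcUrl: jdbc:mysql://decxpdb007:3306/${D |
| BNAME}?jdbcCompliantTruncation=f |
| alse&zeroDateTimeBehavior=conver |
| tToNull&tinyInt1isBit=false&allo |
| wMultiQueries=true&yearIsDateTyp |
| e=false |
| state: OFFLINE:ERROR |
"""

props = parse_properties(SAMPLE)
print("".join(props["resourceJdbcUrl"]))  # URL fragments were split mid-word
print(" ".join(props["pendingError"]))    # message fragments read better with spaces
print(props["state"] == ["OFFLINE:ERROR"])
```

Joining with an empty string suits the URL, while joining with a space suits the error message; which one is right for a given field is a judgment call the caller has to make.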
 
#742
Cody Payne (User)
Fresh Boarder
Posts: 18
Re:Replication Breaking Under Load? 2 Months ago Karma: 0  
It just happened again, this time with only a 350-user test.

|decxpdb007.corporate.connextions.net: REPLICATOR(role=slave, |
|state=OFFLINE:ERROR) |
+-----------------------------------------------------------------------+
| appliedLastEventId: NONE |
| appliedLastSeqno: -1 |
| appliedLatency: -1.0 |
| clusterName: cluster2 |
| currentEventId: NONE |
| host: decxpdb007.corporate.connextions |
|.net |
| latestEpochNumber: -1 |
| masterConnectUri: thl://decxpdb006/ |
| masterListenUri: thl://decxpdb007/ |
| maximumStoredSeqNo: -1 |
| minimumStoredSeqNo: -1 |
| pendingError: Event application failed: Unable |
|to insert event into storage |
| pendingErrorCode: NONE |
| pendingErrorEventId: 000003:0000000049949445;0 |
| pendingErrorSeqno: 77719 |
| pendingExceptionMessage: Unable to insert event into |
|storage |
| resourceJdbcDriver: com.mysql.jdbc.Driver |
| resourceJdbcUrl: jdbc:mysql://decxpdb007:3306/${D |
| BNAME}?jdbcCompliantTruncation=f |
| alse&zeroDateTimeBehavior=conver |
| tToNull&tinyInt1isBit=false&allo |
| wMultiQueries=true&yearIsDateTyp |
| e=false |
| resourceVendor: mysql |
| role: slave |
| seqnoType: java.lang.Long |
| sourceId: decxpdb007 |
| state: OFFLINE:ERROR |
| timeInStateSeconds: 331.416 |
| uptimeSeconds: 153360.728 |
+-----------------------------------------------------------------------+
+-----------------------------------------------------------------------+
|decxpdb007:DATASERVER(state=ONLINE) |
+-----------------------------------------------------------------------+
| state: ONLINE |
+-----------------------------------------------------------------------+
 
#743
Gilles Rayrat (User)
Senior Boarder
Posts: 45
Gender: Male Location: Grenoble, France
Re:Replication Breaking Under Load? 2 Months ago Karma: 2  
Hi Cody,
Can you have a look at the replicator logs and copy and paste any errors or warnings you see around the time of the failure?
Thanks,
Gilles.
 
#744
Jonathan Sharley (User)
Fresh Boarder
Posts: 4
Re:Replication Breaking Under Load? 2 Months ago Karma: 1  
We're theorizing that the direct connections used for read/write splitting overloaded the capacity of the slave. During high load even mytop wouldn't reliably connect via the socket. Raising the MySQL back_log parameter from its default of 50 to 500 seems to have mitigated the problems for now.
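
For anyone applying the same change: back_log is a server option that normally goes in the [mysqld] section of the MySQL option file, and it is not a dynamic variable, so the server has to be restarted to pick it up. The Python sketch below is only an illustration. It shows what the setting looks like and checks which value a given option file would apply. It assumes a simple file with no !include directives; the sample file text and the function name configured_back_log are invented here, and only the values 50 and 500 come from the post.

```python
import configparser

MYSQL_DEFAULT_BACK_LOG = 50  # default value cited in the post

# Illustrative option file content; only "back_log = 500" reflects the post.
SAMPLE_MY_CNF = """
[mysqld]
# raised from the default of 50
back_log = 500
"""


def configured_back_log(option_file_text):
    """Return the back_log value set under [mysqld], or the default if it is absent."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # option files may contain bare flags without a value
        strict=False,
        interpolation=None,
    )
    # MySQL accepts dashes and underscores interchangeably in option names.
    parser.optionxform = lambda name: name.replace("-", "_")
    parser.read_string(option_file_text)
    if parser.has_option("mysqld", "back_log"):
        return parser.getint("mysqld", "back_log")
    return MYSQL_DEFAULT_BACK_LOG


print(configured_back_log(SAMPLE_MY_CNF))
```

On a running server the effective value can be confirmed with SHOW VARIABLES LIKE 'back_log'.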
 
Report to moderator   Logged Logged  
  The administrator has disabled public write access.
#745
Gilles Rayrat (User)
Senior Boarder
Posts: 45
Gender: Male Location: Grenoble, France
Re:Replication Breaking Under Load? 2 Months ago Karma: 2  
OK, thanks for the input.
Please don't hesitate to report back if these tests point to any enhancement we could make.
Thanks,
Gilles.
 
#766
ab dh (User)
Fresh Boarder
Posts: 3
(Removed by administrator) 3 Weeks, 2 Days ago Karma: 0  
 
Last Edit: 2010/08/16 12:18 By robert.hodges@continuent.com.