EaaSI Development Update - December 2022

Due to staffing concerns - including the departure of our Program Manager early in November - development work slowed as the team addressed a number of logistical, administrative, and personal issues. However, the team did reach a shared understanding of the development roadmap, and every priority roadmap item now has at least a draft RFC for implementation. This will speed onboarding of a new Program Manager and allow OpenSLX, Dual, and me to continue making progress toward targeted EaaSI goals and a new tagged release in the new year.

The immediate priority for dev support has shifted to a long-planned upgrade of Yale’s “production” instance of EaaSI. This server, which hosts the environment and software resources that have so far seeded much of the EaaSI Network, including public environments on hosted.eaasi.cloud, has remained on the 2019.11 release of the EaaSI platform. Bringing Yale’s primary EaaSI node in line with the most recent, recommended/supported version of EaaSI will greatly streamline planned adjustments in the Development Roadmap to EaaSI’s metadata model and access control.

  • Dev planning summary
  • Recent work accomplished (September - December 2022)
    • coordinated with Yale Library IT to set up necessary infrastructure for testing Yale upgrade from 2019.11 to 2021.10
    • completed proof-of-concept interface for scripted/automated migration-via-emulation
    • fixed issues with front-end handling of spaces in file names for objects (Software and Content)
    • debugged reported issues with environment-proposer in experimental UVI API
    • debugged reported issues with running out of socket descriptors on low-resourced EaaSI deployments (determined issue is not unique to EaaSI, is present in all EaaS deployments; fixed corresponding resource leaks in EaaS back-end)
    • fixed migration path for running OAI-PMH synchronization on resources originally created in deployments 2019.11 or earlier to deployments 2021.10 or later
  • Forum tickets/bugs resolved
    • n/a
  • Expected work (December 2022 - January 2023)
    • resolve resource leakage in third-party EaaS libraries leading to overload of socket descriptors and related performance issues (present upstream across all EaaS installations)
    • complete copy of data from Yale’s production server to test instance and begin testing eaasi-installer migration paths
    • complete migration of user permissions/ownership information from front-end to back-end object archive API
    • debug reported issues with environment-proposer endpoint in experimental UVI API
    • resolve reported security vulnerability in Keycloak versions < 13.0.0 (debug failing automated Keycloak database migrations to allow for upgrade to most recent versions of Keycloak)
    • tweak eaasi-installer to account for MinIO project dropping compatibility with fetching objects from S3-capable cloud storage providers

Does the above refer to SheepShaver leaking memory, which then causes EaaSI's memory to fill up and fail? Or is it something else?

Thanks!!
Cynde


Hi Cynde! No, I probably should have put a little more detail on that particular point. This issue is particularly apparent on cloud deployments of EaaS/EaaSI and has to do with the server not closing socket connections between the client and server in a timely manner after running an environment. I believe the particular third-party library involved is WebRTC, so especially for environments with WebRTC audio enabled, the sockets assigned to the WebRTC client-to-server connection remain open even after the environment session is shut down (even though the original client request - the environment running in the UI - is gone). On dedicated server/single-machine installations, Linux usually does a good job of periodically clearing out unused socket connections and thereby freeing up the associated RAM/CPU for those processes, which is probably why this hasn’t come up before. In a cloud deployment, though, where CPU cores and RAM are assigned on demand, this leakage of unused sockets seems to keep RAM and CPU tied up for longer, which leads to cascading issues when users try to do just about anything else and there’s no more dynamic RAM/CPU available. A work-around/fix is coming.

So SheepShaver in particular leaking memory would be a separate issue. Apologies, I don’t recall hearing about that before. If you can make a post with a little more detail in the Support Center, we can try to get it in the queue to take a look!
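
On the socket descriptor point above: for anyone who wants a rough check of whether sockets are piling up on their own Linux host, here is a minimal sketch. It is not part of EaaSI or EaaS, and the function names `count_open_sockets` and `demo_leak_and_release` are made up for illustration. It only assumes a Linux `/proc` filesystem and counts how many of a process's file descriptors point at sockets.

```python
import os
import socket


def count_open_sockets(pid="self"):
    """Count a process's file descriptors that refer to sockets (Linux only).

    `pid` is a process ID, or "self" for the current process. Reading another
    user's process usually requires root.
    """
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        # Socket descriptors show up as links of the form "socket:[inode]".
        if target.startswith("socket:"):
            count += 1
    return count


def demo_leak_and_release():
    """Show the count rising while a socket is held open and falling on close."""
    before = count_open_sockets()
    held = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # left open on purpose
    during = count_open_sockets()
    held.close()  # an explicit close releases the descriptor
    after = count_open_sockets()
    return before, during, after
```

Calling `count_open_sockets(pid)` with the PID of a server process before and after running a few environment sessions would show whether the number keeps climbing. Which process is the relevant one on a given EaaSI node is something you would need to confirm for your own deployment.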
